Technical Committee

ATM Forum Performance Testing Specification

AF-TEST-TM-0131.000

October, 1999

(c) 1999 by The ATM Forum. The ATM Forum hereby grants its members the limited right to reproduce in whole, but not in part, this specification for its members' internal use only and not for further distribution. This right shall not be, and is not, transferable. All other rights reserved. Except as expressly stated in this notice, no part of this document may be reproduced or transmitted in any form or by any means, or stored in any information storage and retrieval system, without the prior written permission of The ATM Forum.

The information in this publication is believed to be accurate as of its publication date. Such information is subject to change without notice and The ATM Forum is not responsible for any errors. The ATM Forum does not assume any responsibility to update or correct any information in this publication. Notwithstanding anything to the contrary, neither The ATM Forum nor the publisher makes any representation or warranty, expressed or implied, concerning the completeness, accuracy, or applicability of any information contained in this publication. No liability of any kind shall be assumed by The ATM Forum or the publisher as a result of reliance upon any information contained in this publication.

The receipt or any use of this document or its contents does not in any way create, by implication or otherwise:

• Any express or implied license or right to or under any ATM Forum member company's patent, copyright, trademark or trade secret rights which are or may be associated with the ideas, techniques, concepts or expressions contained herein; nor

• Any warranty or representation that any ATM Forum member companies will announce any product(s) and/or service(s) related thereto, or if such announcements are made, that such announced product(s) and/or service(s) embody any or all of the ideas, technologies, or concepts contained herein; nor

• Any form of relationship between any ATM Forum member companies and the recipient or user of this document.

Implementation or use of specific ATM standards or recommendations and ATM Forum specifications will be voluntary, and no company shall agree or be obliged to implement them by virtue of participation in The ATM Forum. The ATM Forum is a non-profit international organization accelerating industry cooperation on ATM technology. The ATM Forum does not, expressly or otherwise, endorse or promote any specific products or services.

NOTE: The user's attention is called to the possibility that implementation of the ATM interoperability specification contained herein may require use of an invention covered by patent rights held by ATM Forum Member companies or others. By publication of this ATM interoperability specification, no position is taken by The ATM Forum with respect to the validity of any patent claims or of any patent rights related thereto or the ability to obtain the license to use such rights. ATM Forum Member companies agree to grant licenses under the relevant patents they own on reasonable and nondiscriminatory terms and conditions to applicants desiring to obtain such a license.

For additional information contact:
The ATM Forum, Worldwide Headquarters
2570 West El Camino Real, Suite 304
Mountain View, CA 94040-1313
Tel: +1-650-949-6700
Fax: +1-650-949-6705

Acknowledgement

Much work went into the development of this specification.
It could not have been completed without the ATM Forum contributions and the participation of many people in the TEST and TM working groups. In particular, the Editor would like to recognize the following members who made significant contributions:

John Adams, Mustapha Aissaoui, Gojko Babic, Abdella Battou, J. Belanger, Arnold Bragg, Walter Buehler, Norman Carder, Tom Chen, John Clark, Leslie Collica, Gregan Crawford (TEST-chair), Siamak Dastangoo, Emmanuel Desmet, Arjan Durresi, Rod Ferguson, G. Garner, Natalie Giroux (TM-chair), Kenneth Green, Kenneth C. Glossbrenner, Kent Headrick, Ivy Hsu, Werner Hug, Doug Hunt, Raj Jain (former Editor), Sungwon Kang, Deepak Kataria, Dinyar Kavouspour, Hyoung-soo Kim, Kuo-Hui Liu, Brian McBride, Ken McInerney, Randy Mitchell, B. Morris, Raj Nair, Bruce Northcote, Deane Osbourne, Rick Raskin, Andreas Schubert, Nancy Schult, Piergiorgio Vittori, Tom Worster.

The assistance by these members and the many others that contributed to getting this work started and participated in the TEST and TM working groups is greatly appreciated.

Fred Kaudel, Editor.

Table of Contents

1. INTRODUCTION
1.1. SCOPE
1.2. GOALS OF PERFORMANCE TESTING
1.3. NON-GOALS OF PERFORMANCE TESTING
1.4. TERMINOLOGY
1.5. ABBREVIATIONS
2. OVERVIEW OF PERFORMANCE TESTING
2.1. PERFORMANCE TESTING ABOVE THE ATM LAYER
2.2. PERFORMANCE TESTING AT THE ATM LAYER
2.3. REQUIREMENTS FOR PERFORMANCE TESTING
3. PERFORMANCE TESTING METHODOLOGY
3.1. ATM SPECIFIC ISSUES
3.2. MEASUREMENT POINTS AND REFERENCE EVENTS
3.3. REFERENCE LOADS
3.3.1. Basic Framework of a Frame Reference Load Model
3.3.2. Frame Source Parameters
3.3.3. Multiplexer Parameters
3.3.4. Definition of a Frame RLM
3.3.5. Example of the Definition of a Frame RLM
3.4. TEST CONFIGURATIONS
3.4.1. Foreground Traffic
3.4.2. Background Traffic
3.5. GENERAL MEASUREMENT PROCEDURES
3.6. STATISTICAL VARIATIONS
3.7. OPTIONAL TRAFFIC MANAGEMENT FUNCTIONS AND PROCEDURES
3.8. REPORTING RESULTS
4. PERFORMANCE METRICS
4.1. THROUGHPUT
4.1.1. Definitions
4.1.2. Units
4.1.3. Measurement Procedures
4.2. FRAME LATENCY
4.2.1. Definition
4.2.2. Units
4.2.3. Measurement Procedures
4.2.4. Burst Model Measurement Procedures
4.3. FAIRNESS INDEX
4.3.1. Definition
4.3.2. Units
4.3.3. Measurement Procedures
4.4. FRAME LOSS RATIO
4.4.1. Definition
4.4.2. Units
4.4.3. Measurement Procedures
4.5. MAXIMUM FRAME BURST SIZE (MFBS)
4.5.1. Definition
4.5.2. Units
4.5.3. Measurement Procedures
5. REFERENCES
APPENDIX A: DEFINING FRAME LATENCY ON ATM NETWORKS
A.1. INTRODUCTION
A.2. USUAL FRAME LATENCIES AS METRICS FOR ATM SWITCH DELAY
A.2.1. LIFO Latency
A.2.2. FIFO Latency
A.2.3. FILO Latency
A.3. MIMO LATENCY DEFINITION
A.4. CELL AND CONTIGUOUS FRAME LATENCY THROUGH A ZERO-DELAY SWITCH
A.5. LATENCY OF DISCONTINUOUS FRAMES PASSING THROUGH A ZERO-DELAY SWITCH
A.6. CALCULATION OF FILO LATENCY FOR A ZERO-DELAY SWITCH
A.7. EQUIVALENT MIMO LATENCY DEFINITION
A.8. AN ALTERNATIVE DEFINITION OF MIMO
A.9. MIMO LATENCY OF A PATH
A.10. MEASURING MIMO LATENCY
A.11. USER PERCEIVED DELAY
A.12. OTHER DELAY METRICS
APPENDIX B: METHODOLOGY FOR IMPLEMENTING SCALABLE TEST CONFIGURATIONS
B.1. INTRODUCTION
B.2. PARALLEL TRAFFIC REPLICATION
B.3. SERIAL TRAFFIC REPLICATION
B.3.1. Implementation of External Connections
B.3.2. Implementation of Internal Connections
B.3.3. Background Traffic
B.3.4. Examples of Scalable Connection Configurations
B.3.4.1. n-to-n Straight (Single Generator)
B.3.4.2. n-to-n Straight (r Generators)
B.3.4.3. n-to-m Partial Cross (r Generators)
APPENDIX C: STATISTICAL ANALYSIS OF ATM PERFORMANCE PARAMETERS
C.1. CONFIDENCE INTERVAL ESTIMATION FOR LARGE SAMPLE SIZES: ASYMPTOTIC
C.2. ACCURACY OF CONFIDENCE INTERVAL ESTIMATION FOR SMALL SAMPLE SIZES
C.2.1. Confidence Intervals Based on Asymptotic Normality
C.2.2. Confidence Intervals Based on Bootstrap
C.2.2.1. Bootstrap Steps for Calculating a Confidence Interval for the Mean
C.3. ACCURACY OF CONFIDENCE INTERVAL ESTIMATION FOR NON-I.I.D. DATA
C.3.1. Confidence Intervals for the Variance
C.3.1.1. Jackknife Steps for Calculating a Variance Estimate and Confidence Interval
C.3.2. Generalized Jackknife
APPENDIX D: IN-SERVICE AND OUT-OF-SERVICE MEASUREMENT OF QOS PARAMETERS
D.1. INTRODUCTION
D.2. OUT-OF-SERVICE MEASUREMENT OF QOS PARAMETERS
D.2.1. Introduction
D.2.2. Test Cell Format
D.2.3. Test Configuration
D.2.3.1. Test Cell Input
D.2.3.2. Test Cell Output
D.2.4. Analysis of the Test Cell Stream
D.2.4.1. Cell Error Ratio (CER)
D.2.4.2. Cell Loss Ratio (CLR)
D.2.4.3. Cell Misinsertion Rate (CMR)
D.2.4.4. Cell Missequenced Ratio (CSR)
D.2.4.5. Measuring Cell Transfer Times
D.2.4.6. Mean Cell Transfer Delay (MCTD)
D.2.4.7. Maximum Cell Transfer Delay
D.2.4.8. Peak-to-peak Two-point Cell Delay Variation
D.2.5. AAL-5 Test Frame Format
D.2.5.1. AAL-5 Test Frame Test Cell Format
D.2.5.2. Frame Sequence Number
D.2.5.3. Frame Test Cell Sequence Number
D.2.5.4. Payload Type
D.3. IN-SERVICE MEASUREMENT OF QOS PARAMETERS
D.3.1. Introduction
D.3.2. ATM Layer OAM Flows
D.3.3. Performance Management Procedures
D.3.4. OAM Performance Management Cell
D.3.5. Out-of-service Use of Performance Management OAM
D.3.6. Benefits of Performance OAM Technique
D.3.7. ATM Test Cell
APPENDIX E: EXAMPLES OF FRAME REFERENCE LOAD MODELS
E.1. INTRODUCTION
E.2. PSEUDO-CODE REPRESENTATION OF AN ABSTRACT TEST
E.3. EXAMPLES
E.3.1. Simple Single-cell Test
E.3.2. Constant Frame Size, Constant Inter-frame Gap
E.3.3. Constant Frame Size, Variable Inter-frame Gap
E.3.4. Variable Frame Size, Constant Inter-frame Gap
E.3.5. Variable Frame Size, Variable Inter-frame Gap
E.3.6. Persistent (Greedy) Source
E.3.7. Staggered Persistent Source
E.3.8. Variable Load Source
E.3.9. Bursty Source
E.3.10. Three-state Traffic Source
APPENDIX F: EXAMPLES FOR REPORTING TEST RESULTS
APPENDIX G: SIMPLE STATISTICAL METHODS RECIPE
G.1. CONFIDENCE INTERVAL ESTIMATION
G.1.1. Purpose
G.1.2. Confidence Interval for the Mean
G.1.2.1. Conditions
G.1.2.2. Confidence Interval
G.1.2.3. Walkthrough
G.1.3. Confidence Interval Using the Bootstrap Method
G.1.3.1. Purpose
G.1.3.2. Conditions
G.1.3.3. Walkthrough
G.2. BATCH MEANS
G.2.1. Purpose
G.2.2. Condition
G.2.3. Walkthrough
G.2.4. Discussion
G.2.4.1. How to Choose b and k?

1. Introduction

Performance testing in ATM deals with the measurement of the level of quality of a System Under Test (SUT) or an Implementation Under Test (IUT) under well-known conditions. The level of quality can be expressed in the form of metrics such as latency, end-to-end delay, and effective throughput.
Performance testing can be carried out at the end-user application level (e.g., FTP, NFS) or at or above the ATM layer (e.g., cell switching, signalling). Performance testing also describes in detail the procedures for testing the IUTs in the form of test suites. These procedures are intended to test the SUT or IUT and do not assume or imply any specific implementation or architecture of these systems. This document highlights the objectives of performance testing and suggests an approach for the development of the test suites. Several test cases are embedded into the measurement procedures of the different metrics. This specification is not intended to be a complete test suite.

1.1. Scope

Asynchronous Transfer Mode, as an enabling technology for the integration of services, is gaining increasing interest and popularity. ATM networks are being progressively deployed and in most cases a smooth migration to ATM is prescribed. This means that most of the existing applications can still operate over ATM via service emulation or service interworking, along with the proper adaptation of data formats. At the same time, several new applications are being developed to take full advantage of the capabilities of the ATM technology through an Application Programming Interface (API).

While ATM provides an elegant solution to the integration of services and allows for high levels of scalability, the performance of a given application may vary substantially with the IUT or the SUT utilized. The variation in performance is due to the complexity of the dynamic interaction between the different layers. For example, an application running over a TCP/IP stack will yield different levels of performance depending on the interaction of the TCP window flow control mechanism and the ATM network congestion control mechanism used. Hence, the following points and recommendations are made.

First, ATM adopters need guidelines on the measurement of the performance of user applications over different systems. Second, some functions above the ATM layer, e.g., adaptation and signalling, constitute applications (i.e., IUTs) and as such should be considered for performance testing. Also, it is essential that these layers be implemented in compliance with the ATM Forum specifications. Third, performance testing can be executed at the ATM layer in relation to the QoS provided by the different service categories. Finally, because of the extensive list of available applications, it is preferable to group applications into generic classes. Each class of applications requires a different testing environment, including metrics, test suites and traffic test patterns. Note that the same application, e.g., FTP, can yield different performance results depending on the underlying layers used (TCP/IP over ATM versus TCP/IP over a MAC layer over ATM). Thus, performance results should be compared based on the utilization of the same protocol stack.

Performance testing is related to the user perceived performance of ATM technology. In other words, the goodness of ATM will be measured not only by cell-level performance but also by frame-level performance and performance perceived at higher layers. Most of the Quality of Service (QoS) metrics, such as cell transfer delay (CTD), cell delay variation (CDV), cell loss ratio (CLR), and so on, may or may not be reflected directly in the performance perceived by the user.
For example, when comparing two switches, if one gives a CLR of 0.1% and a frame loss ratio of 0.1% while the other gives a CLR of 1% but a frame loss ratio of 0.05%, the second switch will be considered superior by many users.

The ATM Forum and ITU-T have standardized the definitions of ATM layer QoS metrics and their measurement [6, 2, 1, 9]. This specification does the same for adaptation layer performance metrics. Without a standard definition, each vendor will use its own definition of common metrics such as throughput and latency, resulting in confusion in the marketplace. Avoiding such confusion will help buyers, eventually lead to better sales, and contribute to the success of ATM technology.

1.2. Goals of Performance Testing

The goal of this effort is to enhance the marketability of ATM technology and equipment. Any additional criteria that help in achieving that goal can be added later to this list.

a. The ATM Forum shall define metrics that will help compare various ATM equipment in terms of performance.
b. The metrics shall be independent of switch or NIC architecture. The same metrics shall apply to all architectures.
c. The metrics can be used to help predict the performance of an application or to design a network configuration to meet specific performance objectives.
d. The ATM Forum will develop a precise methodology for measuring these metrics. The methodology will include a set of configurations and traffic patterns that will allow vendors as well as users to conduct their own measurements.
e. The testing shall cover all classes of service, including CBR, rt-VBR, nrt-VBR, UBR, ABR, and GFR.
f. The metrics and methodology for different service categories may be different.
g. The testing should cover as many protocol stacks and ATM services as possible. As an example, measurements for verifying the performance of services such as IP, Frame Relay and SMDS over ATM may be included.
h. The following objectives are set for ATM performance testing:
(i) Definition of criteria to be used to distinguish classes of applications.
(ii) Definition of classes of applications, at or above the ATM Layer, for which performance metrics are to be provided.
(iii) Identification of the functions at or above the ATM Layer which influence the perceived performance of a given class of applications. Examples of such functions include traffic shaping, quality of service, and adaptation. These functions need to be measured in order to assess the performance of the applications within that class.
(iv) Definition of common performance metrics for the assessment of the performance of all applications within a class. The metrics should reflect the effect of the functions identified in (iii).
i. The scope of this first revision of the ATM Forum Performance Testing Specification is limited to AAL-5.

1.3. Non-Goals of Performance Testing

a. The ATM Forum is not responsible for conducting any measurements.
b. The ATM Forum will not:
(i) certify measurements;
(ii) evaluate or assess results obtained by companies or other bodies;
(iii) certify bodies conducting measurements.
c. The ATM Forum will not set thresholds such that equipment performing below those thresholds is called "unsatisfactory."
d. The ATM Forum neither performs nor specifies benchmark testing (see [1]).
e. The ATM Forum will not establish any requirement that dictates a cost versus performance ratio.
f. Applications whose performance cannot be assessed by common implementation-independent metrics are excluded from the scope of ATM performance testing. In this case, performance is tightly related to the implementation. An example of such applications is network management, whose performance depends on whether the implementation is centralized or distributed.

1.4. Terminology

The following definitions are used in this document:

* Activity: A phase consists of one or more active periods, separated by idle periods (inter-activity gaps).
* ATM Analyzer: ATM measuring equipment used to measure the characteristics of traffic received from the SUT.
* ATM Connection: An ATM connection consists of the concatenation of ATM layer links in order to provide an end-to-end transfer capability to the access point (or T reference point) - from ITU-T Recommendation I.150 [3].
* ATM Generator: ATM measuring equipment used to produce traffic with specified characteristics.
* ATM Interface: The ATM physical interface where ATM traffic enters and/or leaves the SUT.
* Background Connection: An ATM connection that carries background traffic.
* Background Traffic: Traffic made up of cells whose purpose is to load the SUT at an appropriate level; the performance of these cells is not of primary interest.
* Connection Load: Precisely defined specification of a pattern of ATM cells over a single connection. When cells are parts of frames, a connection load may be defined in terms of a frame pattern and an inter-frame gap distribution.
* Foreground Connection: An ATM connection that carries foreground traffic.
* Foreground Traffic: Traffic made up of cells or frames whose performance is being measured.
* Frame: A sequence of cells corresponding to an AAL-5 PDU (delineated by setting the AUU bit to 1 in the last cell of the frame). The AUU bit is the LSB in the three-bit PTI code point 0xx (user data cell, as defined in ITU-T Recommendation I.361 [8]).
* Frame Pattern: Precisely defined pattern of cells and inter-cell gaps, inter-frame gaps, and frame sizes. A frame pattern can be defined statistically in terms of distributions of frame sizes, inter-frame gaps, and inter-cell gaps.
* Frame Size: The number of cells in a frame.
* Implementation Under Test (IUT): The part of the system that is to be tested.
* Input ATM Interface Rate: Maximum nominal rate in cells/second at which cells can be received at the interface.
* Interface Load: Set of Connection Loads applied to an input ATM interface.
* Interval: An active period consists of one or more characteristic intervals, separated by inter-interval gaps.
* Inter-activity Gap: The number of idle cells between the last interval of one activity and the first interval of the next activity.
* Inter-cell Gap: The time between transmission of the last bit of one cell and transmission of the first bit of the next cell of the same frame.
* Inter-frame Gap: The time between transmission of the last bit of the last cell of one frame and transmission of the first bit of the first cell of the next frame.
* Inter-interval Gap: The number of cells between transmission of the last frame of one interval and the first frame of the next interval.
* Inter-phase Gap: The number of cells between the last activity of one phase and the first activity of the next phase.
* Loopback: An external connection that connects the output and input of the same ATM interface.
* Measurement Point: A measurement point is located at an interface that separates either customer equipment/customer network or a Switching/Signalling Node from an attached transmission system, at which protocols can be observed [5].
* Metric: A quantitative measure of the goodness of the overall service offered by an SUT, for example, throughput or throughput fairness. It reflects quantitatively the response or the behavior of an IUT or an SUT.
* Monitor: ATM measuring equipment used to assess transfer performance of an SUT [9].
* Network Module: A group of switch ATM interfaces that physically reside on a single card.
* Output ATM Interface Rate: The maximum nominal rate in cells/second at which cells can be transmitted from the interface.
* Parameter: A quantitative measure of the goodness of service received by a connection, for example, cell transfer delay or cell delay variation.
* Phase: A test consists of one or more test phases, separated by inter-phase gaps.
* Port: Same as ATM Interface.
* Reference Load: The set of Interface Loads applied to an SUT.
* Scalable Configuration: A configuration that permits loading the SUT using a minimal number of ATM monitors.
* Switch Fabric: The switch component whose main function is to transfer cells among the various interfaces of the switch.
* System Under Test (SUT): Any collection of ATM equipment that is being tested. It could be a single switch or a network of ATM switches. It includes the IUT.
* Test Case: A series of test steps needed to put an IUT into a given state to observe and describe its behavior.
* Test Suite: A complete set of test cases, possibly combined into nested test groups, that is necessary to perform testing for an IUT or a protocol within an IUT.
* Traffic Load: Connection Load, Interface Load or Reference Load.
* Wire: The physical medium used for external connections of the SUT's ATM interfaces. The medium could be copper cables, optical fibers, or wireless links.
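The Frame definition above (AUU = 1 in the last cell of an AAL-5 PDU) can be illustrated in code. The following Python sketch is illustrative only and not part of this specification; it assumes raw 53-octet cells in the UNI format, in which the PTI occupies bits 3 through 1 of the fourth header octet and the CLP bit is bit 0. All function names are hypothetical.

    def pti(cell: bytes) -> int:
        # PTI is bits 3..1 of the fourth header octet (UNI cell format);
        # bit 0 of that octet is CLP.
        return (cell[3] >> 1) & 0x07

    def is_user_data(cell: bytes) -> bool:
        # PTI code points 0xx carry user data (ITU-T I.361).
        return (pti(cell) & 0x4) == 0

    def is_end_of_frame(cell: bytes) -> bool:
        # The AUU bit is the LSB of the PTI; AUU = 1 marks the last
        # cell of an AAL-5 frame.
        return is_user_data(cell) and (pti(cell) & 0x1) == 1

    def frames(cells):
        # Group a cell stream into AAL-5 frames by AUU delineation.
        frame = []
        for cell in cells:
            if not is_user_data(cell):
                continue  # non-user-data cells do not belong to the frame
            frame.append(cell)
            if is_end_of_frame(cell):
                yield frame
                frame = []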
1.5. Abbreviations

AAL      ATM Adaptation Layer
ABR      Available Bit Rate
AG       Application Goodput
API      Application Programming Interface
ATM      Asynchronous Transfer Mode
AUU      ATM User to User
BEDC     Block Error Detection Code
BIP      Bit Interleaved Parity
BLER     BLock Error Result
BR       Burst Responsiveness
CAC      Connection Admission Control
CBR      Constant Bit Rate
CDV      Cell Delay Variation
CDVT     Cell Delay Variation Tolerance
CER      Cell Error Ratio
CIT      Cell Input Time
CLP      Cell Loss Priority
CLR      Cell Loss Ratio
CMR      Cell Misinsertion Rate
COT      Cell Output Time
CRC      Cyclic Redundancy Check
CRE      Cell-level Reference Event
CS       Convergence Sublayer
CSR      Cell misSequenced Ratio
CTD      Cell Transfer Delay
EDC      Error Detection Code
EOF      End of Frame
FFL      Full Foreground Load
FIFO     First In, First Out
FILO     First In, Last Out
FLR      Frame Loss Ratio
FRE      Frame-level Reference Event
FSN      Frame Sequence Number
FTP      File Transfer Protocol
GFR      Guaranteed Frame Rate
HEC      Header Error Control
HW       Hardware
ICG      Inter-Cell Gap
IETF     Internet Engineering Task Force
IFG      Inter-Frame Gap
iid      independent, identically distributed
IP       Internet Protocol
IPR      Impaired PDU Ratio
IRE      Internal Reference Event
ISDN     Integrated Services Digital Network
ISO      International Organization for Standardization
ITU-T    International Telecommunication Union - Telecommunication standardization sector
IUT      Implementation Under Test
LAN      Local Area Network
LIFO     Last In, First Out
LILO     Last In, Last Out
LSB      Least Significant Bit
MAC      Media Access Control
MaxCTD   Maximum Cell Transfer Delay
MBL      Maximum Background Load
MBS      Maximum Burst Size
MCBS     Maximum Cell Burst Size
MCR      Minimum Cell Rate
MCSN     Monitoring Cell Sequence Number
MCTD     Mean Cell Transfer Delay
MFBS     Maximum Frame Burst Size
MFL      Maximum Foreground Load
MFS      Maximum Frame Size
MIMO     Message In, Message Out
MinCTD   Minimum Cell Transfer Delay
MP       Measurement Point
MSB      Most Significant Bit
MTU      Maximum Transmission Unit
NFS      Network File System
NIC      Network Interface Card
NP       Network Performance
NPC      Network Parameter Control
OAM      Operations and Maintenance
PCR      Peak Cell Rate
PDU      Protocol Data Unit
PPI      Proprietary Payload Indicator
PRBS     Pseudo Random Bit Sequence
PTI      Payload Type Indicator
PVC      Permanent Virtual Circuit
QoS      Quality of Service
RL       Reference Load
RLM      Reference Load Model
SAP      Service Access Point
SAR      Segmentation And Reassembly
SCR      Sustained Cell Rate
SECBR    Severely-Errored Cell Block Ratio
SMDS     Switched Multi-Megabit Data Service
SUT      System Under Test
SVC      Switched Virtual Circuit
SW       Software
TCP      Transmission Control Protocol
TCPT     Test Cell Payload Type
TMN      Telecommunications Management Network
TRCC     Total Received Cell Count
TUC      Total User Cell
UBR      Unspecified Bit Rate
UN       UNspecified bytes
UNI      User Network Interface
UPC      Usage Parameter Control
UTP      Unshielded Twisted Pair
nrt-VBR  non-real-time Variable Bit Rate
rt-VBR   real-time Variable Bit Rate
VC       Virtual Circuit
VCI      Virtual Channel Identifier
VCC      Virtual Channel Connection
VP       Virtual Path
VPC      Virtual Path Connection
VPI      Virtual Path Identifier
WAN      Wide Area Network
WG       Working Group

2. Overview of Performance Testing

Applications should be grouped to simplify their testing. Applications with similar performance requirements can be grouped together. This simplifies the testing of new applications, as they most likely fit into an existing grouping, for which the performance metrics and test procedures of this specification can be used. This section provides overviews of performance testing above the ATM layer, performance testing at the ATM layer, and requirements for performance testing.
2.1. Performance Testing Above the ATM Layer

Performance metrics can be measured at the user application layer, and sometimes at the transport layer and the network layer, and can give an accurate assessment of the perceived performance. The perceived performance of a user application running over an ATM network depends on many parameters. It can vary substantially by changing the underlying protocol stack, the ATM service category it uses, the congestion control mechanism used in the ATM network, etc. Furthermore, there is no direct and unique relationship between the ATM Layer Quality of Service (QoS) parameters and the perceived application performance. For example, in an ATM network implementing a packet-level discard congestion mechanism, applications using TCP as the transport protocol may see their effective throughput improved while the measured cell loss ratio may be relatively high.

In practice, it is difficult to carry out measurements in all the layers that span the region between the ATM Layer and the user application layer, given the inaccessibility of testing points. More effort needs to be invested to define the performance at these layers. These layers include adaptation and signalling. This specification applies to AAL-5.

2.2. Performance Testing at the ATM Layer

The notion of an application at the ATM Layer is related to the service categories provided by the ATM service architecture. The Traffic Management Specification, Version 4.1 [2] specifies six service categories: CBR, rt-VBR, nrt-VBR, UBR, ABR, and GFR. Each service category defines a relation of the traffic characteristics and the Quality of Service (QoS) requirements to network behavior. A QoS assessment criterion is associated with each of these parameters. These are summarized below.

QoS Performance Parameter            QoS Assessment Criterion
Cell Error Ratio                     Accuracy
Severely-Errored Cell Block Ratio    Accuracy
Cell Misinsertion Ratio              Accuracy
Cell Loss Rate                       Dependability
Cell Transfer Delay                  Latency
Cell Delay Variation                 Accuracy

Section 5.6 of ITU-T Recommendation I.356 [6] further defines the Severely-Errored Cell Block Ratio.

Performance testing at the ATM Layer deals with in-service and out-of-service measurements of the QoS parameters for all six service categories (or application classes, in the context of performance testing): CBR, rt-VBR, nrt-VBR, UBR, ABR and GFR. The following cases are covered:

(i) Performance of the SUT under non-overloaded conditions.
(ii) Performance of the SUT under overload conditions. In this case, the efficiency of the congestion avoidance and congestion control mechanisms of the SUT is tested.

Appendix D defines methods and configurations for both in-service and out-of-service measurements. The in-service mode uses OAM cells, while the out-of-service mode defines the payloads to be used for test cells on connections running out-of-service measurements. Measurement methods for both in-service and out-of-service modes are also defined in ITU-T Recommendation O.191 [9], whereas in-service performance monitoring procedures are specified in ITU-T Recommendation I.610 [12] on OAM principles and functions. ATM Forum specification [1] also defines out-of-service measurement of several QoS parameters. However, detailed test cases and procedures, as well as test configurations, are needed for both in-service and out-of-service measurement of QoS parameters.
2.3. Requirements for Performance Testing

To provide common performance metrics that are applicable to a wide range of SUTs and that can be uniquely interpreted, the following requirements must be satisfied:

(i) Reference load models for the six service categories CBR, rt-VBR, nrt-VBR, UBR, ABR, and GFR, and for the AAL-5 frame layer, are required. Candidate Reference Load Models (RLMs) shall meet the following criteria:

* The RLM should embody the relevant characteristics of the actual traffic. This means that the System Under Test (SUT) should react similarly to RLM or actual traffic with respect to the important metrics (e.g., buffer characteristics, delays, cell loss rate).
* The algorithm for generating a RLM should be precisely defined and feasible to implement.
* The RLM should be scalable (i.e., the traffic volume can be increased or decreased in a straightforward manner). The traffic generated by the RLM should be reproducible.

Reference load models (also referred to as test traffic profiles) for cell transfer performance measurement of the CBR and VBR service categories are defined in ITU-T Recommendation O.191 [9]. Reference load models for other applications should be considered.

(ii) Test cases and configurations must not assume or imply any specific implementation or architecture of the SUT.

3. Performance Testing Methodology

In the following description, Implementation Under Test (IUT) refers to an ATM switch. However, the definitions and measurement procedures are general and may be used for other devices, or for a network consisting of multiple switches, as well. Section 3.1 describes some of the issues in performance testing ATM devices. Section 3.2 defines the Measurement Points between which the Reference Loads, described in Section 3.3, are applied. Section 3.4 describes why a System Under Test (SUT) should always refer to the IUT plus any specific connection configuration that is used to generate the Reference Load. Section 3.5 discusses general measurement procedures. Section 3.6 considers statistical variations. Section 3.7 highlights optional traffic management functions and procedures. Section 3.8 describes result reporting.

3.1. ATM Specific Issues

ATM is a sophisticated communication technology providing Quality of Service (QoS) for users. This differs from traditional best-effort services, and more sophisticated performance testing is required. Some of the issues in performance testing ATM devices are:

1. Connections have a specified set of traffic and quality of service parameters. These parameters are defined so that a connection can obtain some guaranteed performance or QoS. If a connection does not conform to its traffic parameters, it is deemed non-conformant and can be subject to Usage Parameter Control. The UPC function may tag or discard cells, thereby affecting performance. Therefore it is imperative that test connections be defined so that they can handle the traffic sources' characteristics. Otherwise the reported performance will not be accurate, since the traffic could simply be deemed non-conforming.

2. The allowable connections on a SUT are determined by a connection admission control (CAC) system. There are generally no performance guarantees on connections that would be rejected by the SUT CAC. Therefore, it is important that only connections allowed by the CAC are tested.

3. ATM supports multiple virtual connections (VCs). The performance of an SUT with a single VC generally does not extrapolate to the performance of the SUT with multiple VCs.
4. Many SUTs have proprietary congestion control schemes that can be used to improve performance under certain conditions. The tester should use these performance enhancing schemes, if applicable, when testing the SUT, and should list them in the test report.

5. Cells can be marked as CLP 1 or CLP 0. The CLP settings on the traffic source can have a significant impact on the performance of the SUT.

6. ATM SUTs generally realize significant gains from statistical multiplexing. However, it is difficult to test the performance gains from statistical multiplexing without a large number of test generators and sophisticated test sources. The scalable test configurations provided in this specification may not show any performance gains from statistical multiplexing.

3.2. Measurement Points and Reference Events

Following the definition of cell reference events [2], a similar definition for frame reference events is provided. Moreover, it is illustrated that the physical points where the events are measured are the same for both cell events and frame events.

A frame is defined as a sequence of cells corresponding to an AAL-5 PDU (delineated by setting the AUU bit to 1 in the last cell of the frame). The AUU bit is the LSB in the three-bit PTI code point 0xx (user data cell, as defined in ITU-T Recommendation I.361 [8]).

To describe frame-level performance parameters consistent with the general structure in ITU-T Recommendation I.350 [4], it is necessary to define frame-level reference events that are observable at measurement points in a network and then define relevant performance parameters based on these reference events. The measurement points are defined at physical locations in a network. The frame-level reference events are observable at these locations with suitable test sets; that is, the reference event definitions are based on physical layer test access. Figure 3.1 shows two such frame-level reference events, labeled FRE1 and FRE2, that are observable with suitable test equipment at the measurement points labeled MP1 and MP2, respectively. The SUT is tested either out-of-service in its network, or in a laboratory.

Figure 3.1: Measurement Points and Reference Events

CRE1 and CRE2 are cell-level reference events observable at MP1 and MP2 in Figure 3.1. Cell entry and cell exit events are cell reference events [2]. The AAL-5 layer is the natural place to consider defining reference events at the frame level. Figure 3.2 illustrates two such frame-level reference events defined inside the AAL-5 layer of each full protocol stack. The protocol stacks illustrated are in the test sets at MP1 and MP2.

Figure 3.2: Frame-Level Internal and External Reference Events and Measurement Points

Reference events occur as relevant PDUs move across the protocol layer interface between the AAL-5 Convergence Sublayer (CS) and the AAL-5 Segmentation And Reassembly (SAR) sublayer. With respect to the indicated direction of transmission, these are labeled AAL-5 Internal Reference Event 1 (IRE1) and AAL-5 Internal Reference Event 2 (IRE2), respectively. Ideally, the IREs are the reference events one would like to measure to determine frame-level performance. However, since the IREs reference interfaces within the protocol stack, they are generally not physically accessible with test sets. Each FRE is defined to approximate an IRE, and is observable at an MP with a suitable test set (i.e., the FRE is defined based on physical layer test access).
The information content of FRE1 is, for practical purposes, the same as that of the corresponding IRE1 that generated it. The time of occurrence of FRE1, T1, lags the occurrence of IRE1 by a small and quantifiable amount. Similarly, the time of occurrence of FRE2, T2, leads the occurrence of IRE2 by a small and quantifiable amount.

Consistent with the structure provided by ITU-T Recommendation I.350 [4], which requires that performance parameters be defined in terms of performance-significant reference events that are observable at MPs, FRE1 and FRE2 shown in Figure 3.1 fulfill the requirement of observability at network measurement points MP1 and MP2 (located on the network side of these AAL-5 SAPs, and as close as practical to them), but IRE1 and IRE2 shown in Figure 3.2 do not.

Using the definitions of cell entry and exit events in [2], the following definitions for two frame-level reference events are proposed:

* Frame-level Reference exit Event (shown as FRE1): the occurrence of the cell exit event for the first user data cell of the frame;
* Frame-level Reference entry Event (shown as FRE2): the occurrence of the cell entry event for the last user data cell of the frame.

Observation of these frame-level reference events is possible because the MPs are located in a network. As shown in Figure 3.1, MP1 is located near the transmitting equipment and MP2 is located near the receiving equipment. Test equipment at MP1 and MP2 would reconstruct the frames. Since each of these MPs is located near an ATM SAP, the cell transfer reference events CRE1 and CRE2 as defined in ITU-T Recommendation I.353 [5] are also observable; hence, these MPs can be used to measure ATM cell transfer performance parameters. This coincidence of MPs for frame-level performance and for ATM cell transfer performance simplifies identifying the relations between these two types of performance parameters.

The frame-level performance parameters can be defined based on the above frame-level reference events. Following the approach of ITU-T Recommendation I.356 [6], appropriate frame transfer outcomes are defined based upon the occurrence of frame-level reference entry (FRE2) events at an MP2 near the receiving equipment that correspond to frame-level reference exit (FRE1) events at an MP1 near the transmitting equipment. Two frame-level reference events correspond if they are created by the same frame. Again, in parallel with ITU-T Recommendation I.356 [6], a frame transfer outcome is defined as the occurrence at an MP2 of a FRE2 corresponding to the occurrence at an MP1 of a FRE1, within a specified time Tmax. A frame transfer outcome would generally be further classified by certain criteria, such as whether or not the user information bits in the FRE2 match the user information bits in the corresponding FRE1. Consistent with ITU-T draft Recommendation X.144 Amendment 1 Annex C [14], this approach will be demonstrated by applying it in Sections 4.4.1 and 4.2.1 to develop proposed definitions for user information loss and user information delay performance parameters associated with frame transfer outcomes, respectively.

3.3. Reference Loads

A prerequisite for successful and repeatable performance testing is the definition of standard Reference Load Models (RLMs). RLMs are used to characterize test traffic in a well-defined manner for input into a System Under Test (SUT). With well-defined RLMs specified as part of a test, the tester can run tests that are reproducible by other labs.
Given that this specification discusses measuring the AAL-5 performance of ATM systems, it is fitting that a method for defining standardized frame RLMs be defined for use in that testing. The following sections define a consistent methodology for defining frame RLMs. This will allow testers to define the test sources used for testing without ambiguity.

3.3.1. Basic Framework of a Frame Reference Load Model

A frame RLM has two basic logical components that fully characterize its cell-level traffic pattern: the frame sources and the cell multiplexer. The frame sources and multiplexer are only logical concepts that do not need to exist in any implementation. They can be considered virtual devices. A frame source creates frames as sequences of cells with a well-defined pattern. Generally each source corresponds to a single VC. The multiplexer in turn describes how multiple frame sources, and perhaps cell sources as well, are combined into a single output stream. For simplicity, the input cell rate of a frame source to the multiplexer is defined to be equal to the output cell rate of the multiplexer.

Figure 3.3: Logical model of a frame RLM

The multiplexer conceptually contains cell buffers for each source and an arbitration device. The arbitration device specifies how cells from multiple sources will be placed on the output stream. Conceptually, the actual RLM (the well-defined traffic pattern) is what appears at the egress of the multiplexer. The multiple sources and multiplexer are used to characterize the RLM only, and need not exist in any implementation. Using this framework requires the specification of two sets of parameters: the frame source parameters for each source, and the multiplexer parameters.

3.3.2. Frame Source Parameters

The following parameters can be used to characterize frame sources for a variety of applications. These parameters can be specified as having any mathematical, statistical or algorithmic distribution the author of a RLM feels is appropriate. This allows for a wide range of useful RLMs. The set of required parameters is the minimal subset to be used for defining a frame source. Theoretically, they can be used to define the majority of input RLMs; however, for some sources this may be difficult. The optional parameters are defined recursively so that long-term RLMs can be defined more simply.

Table 3.1: Frame Source Parameters

Required/Optional  Parameter              Units  Comment
Optional           Number_Phases          None   Integer > 0.
Optional           Inter-Phase Gap        Cells  Number of idle cells between generation of consecutive phases.
Optional           Number_Active_Periods  None   Integer > 0.
Optional           Inter-Activity Gap     Cells  Number of idle cells between generation of consecutive activity periods.
Optional           Number_Intervals       None   Integer > 0.
Optional           Inter-Interval Gap     Cells  Number of idle cells between generation of consecutive intervals.
Optional           Number_Frames          None   Integer > 0.
Required           Inter-Frame Gap        Cells  Number of idle cells between generation of consecutive frames.
Required           Frame Size             Cells  Integer > 0.
Required           Inter-Cell Gap         Cells  Number of idle cells between generation of consecutive user cells in a frame.

The following diagram illustrates the hierarchical relationship of the frame source parameters.

Figure 3.4: Relationship between Frame RLM source parameters

1. A test consists of one or more test phases. E.g., phases 1 and 2 might represent a test with one and ten traffic sources.
2. Each test phase (except the last) is followed by an inter-phase gap during which no traffic is generated.
3. A test phase consists of one or more periods of activity. E.g., a series of active periods might correspond to a series of traffic bursts.
4. Each active period (except the last) is followed by an inter-activity gap (idle period) during which no traffic is generated.
5. An active period consists of one or more characteristic intervals. E.g., intervals might correspond to the bin widths used by a self-similar traffic generator.
6. Each characteristic interval (except the last) is followed by an inter-interval gap during which no traffic is generated.
7. Traffic characteristics typically vary from one interval to the next. E.g., inter-frame gaps might vary among intervals, so that the inter-frame gap is ~ exponential(10) in interval N and ~ exponential(15) in interval N+1.
8. Traffic characteristics typically do not vary within a characteristic interval. "Do not vary" does not imply that parameter values are constant. E.g., the length of the inter-frame gap during interval N is exponentially distributed with mean 10. However, parameter values are random variates drawn from this distribution and are not constant within interval N.
9. A characteristic interval consists of one or more frames.
10. Each frame (except the last) is followed by an inter-frame gap during which no traffic is generated.
11. A frame consists of one or more cells.
12. Each cell (except the last) is followed by an inter-cell gap during which no traffic is generated.
13. Gaps at any level can be of length zero. E.g., phases 1 and 2 might represent a test with one and ten traffic sources, and with no pause during the transition.

3.3.3. Multiplexer Parameters

There is one parameter of interest for defining the behavior of the multiplexer:

Arbitration Algorithm - The algorithm used to arbitrate amongst the various frame sources. This algorithm must be a well-defined procedure that specifies which of the various sources with a cell to transmit gets to transmit a cell at any given time. Included are any parameters that are required by the arbitration algorithm.

The author of the frame reference load model will define the arbitration scheme for the multiplexer device. The specification of this arbitration scheme must fully specify how all sources will be multiplexed, so that the traffic pattern for any RLM is well defined at the cell level. The service rate of the arbitration device is assumed to be the output link rate; therefore, idles occur in a RLM if and only if there are no sources with cells to send.

3.3.4. Definition of a Frame RLM

To properly define a frame RLM, the following must be specified:

1. All frame sources must be defined and numbered.
2. The multiplexer arbitration scheme must be specified to arbitrate between the sources.

To specify a frame source, the frame source parameters are defined as being distributed with well-defined distributions. These can be any mathematical, statistical or algorithmic distribution, including constant. With proper definition of distributions for these parameters, a variety of useful frame sources can be constructed. Section 3.3.5 and Appendix E contain example frame RLMs. To fully specify the multiplexer operation, the arbitration algorithm for the multiplexer must be stated. Definition of these parameters specifies a well-defined RLM. It should be noted that these RLMs might be statistical in nature. If any of the parameters are based on statistical distributions, then the cell-level traffic patterns are statistically reproducible over time.
3.3.5. Example of the Definition of a Frame RLM

A hypothetical video compression frame stream could be defined to have the following characteristics:

1. There are two video frame sources with the following properties:
* A new data frame starts every 9 cell times.
* The compressed frame size varies uniformly between 3 and 6 cells.
* The data frames burst at line rate; that is, the ICG = 0.
2. The two sources are multiplexed with round-robin arbitration.

Therefore, the frame RLM would be defined with the following parameters:

Source 0:
    FS = Uniform(3,6)
    ICG = 0
    IFG_i = 9 - FS_i, where i is the frame number

Source 1:
    FS = Uniform(3,6)
    ICG = 0
    IFG_i = 9 - FS_i, where i is the frame number

Arbitration: Round Robin, Number_Of_Sources = 2.

    Round_Robin ( Number_Of_Sources )
    {
        Current_Source = 0
        while ( True )
        {
            if ( Cell_To_Transmit ( Current_Source ) )
            {
                Send_Cell ( Current_Source )
            }
            Current_Source = ( Current_Source + 1 ) mod Number_Of_Sources
        }
    }

The following diagram illustrates this example.

Figure 3.5: Example Frame Reference Load Model of Two Video Sources
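To make this example concrete, the following Python sketch (illustrative only, not part of this specification) simulates the two video sources and the round-robin multiplexer slot by slot and prints which source occupies each output cell time. The slot count and seed are arbitrary; note that, on average, the two sources together offer 9 cells per 9 cell times, so the multiplexer output is close to fully loaded.

    import random

    CELL_PERIOD = 9      # a new data frame starts every 9 cell times
    NUM_SOURCES = 2
    SLOTS = 36           # simulate four frame periods

    def source_schedule(rng, slots):
        # Cell arrival times for one video source: FS ~ Uniform(3,6) cells,
        # ICG = 0 (cells burst at line rate), IFG_i = 9 - FS_i.
        times, t = set(), 0
        while t < slots:
            fs = rng.randint(3, 6)        # compressed frame size in cells
            times.update(t + c for c in range(fs))
            t += CELL_PERIOD              # next frame starts 9 slots later
        return times

    rng = random.Random(1)
    arrivals = [source_schedule(rng, SLOTS) for _ in range(NUM_SOURCES)]
    queues = [0] * NUM_SOURCES            # conceptual per-source cell buffers
    pointer = 0                           # round-robin pointer
    stream = []
    for slot in range(SLOTS):
        for s in range(NUM_SOURCES):      # cells entering the multiplexer
            if slot in arrivals[s]:
                queues[s] += 1
        sent = None
        for k in range(NUM_SOURCES):      # serve the next source holding a cell
            s = (pointer + k) % NUM_SOURCES
            if queues[s]:
                queues[s] -= 1
                sent, pointer = s, (s + 1) % NUM_SOURCES
                break
        stream.append("idle" if sent is None else "src%d" % sent)
    print(" ".join(stream))               # the RLM as seen at the mux egress

Consistent with Section 3.3.3, an idle slot appears in the output if and only if no source has a cell buffered.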
3.4. Test Configurations

Each Reference Load (RL) defined in Section 3.3 must be verifiable at the measurement point MP1, defined in Section 3.2. This specification does not (and should not) mandate any particular method for generating a RL. Some test configurations may include connections that are routed through the IUT several times by looping links (as a way to produce a traffic load on a large number of IUT input interfaces while using a small number of traffic generators - see Appendix B). In such cases, since the behavior of the IUT is formally unknown, so is the exact nature of the aggregate traffic passing through it. Therefore, the traffic pattern is unverifiable at each subsequent ingress to the IUT, as some of the input traffic comes from the IUT outputs.

Consider the following hypothetical example: we might attempt to measure the delay characteristics of two IUTs at an input port load of 98%. One IUT exhibits a CLR of 10^-5 while the other 0.05. The load at the inputs of the IUT with the higher loss might be around 93% or lower, while the load at the inputs of the other IUT would be much closer to the intended 98%. Since the behavior of the two IUTs is different, there is a discrepancy between the test conditions under which the two IUTs are being observed.

The formal discrepancy of an unverifiable RL may be avoided by ensuring that those components of a test configuration that introduce dependencies on IUT performance are included in the definition of the system being tested. Therefore, should the test configuration incorporate looping configurations, the SUT should be defined to include the looping links. This approach has the advantage of being consistent with the definitions of scalable test configurations in Appendix B. It is the responsibility of the tester to verify that the test configuration does indeed produce the specified RL (for example, by moving a cell stream analyzer around to each of the inputs and repeating the test).

A system with n ports can be tested with the following connection configurations (for connections internal to the SUT):

* n-to-n straight,
* n-to-(n-1) full cross,
* n-to-m partial cross, 1 <= m <= n-1,
* k-to-1, 1 < k <= n.

3. If more than one frame must be sent (burst size > 1) to meet the desired burst size, the frames must be sent separated by a constant inter-frame gap. For VBR connections, the size of the gap determines the SCR, and for all other service categories, the inter-frame gap should be set so that the PCR is maintained. Where appropriate, the specified CDVTs, MBS and MFS should be large enough to handle the generated traffic.

4. For repeated measurement of the MIMO latency for frame bursts, repeated bursts may be sent, separated by an inter-interval gap chosen to be large enough that the negotiated traffic parameters of the connection are not violated.

The latency measurement should be taken after sufficient time has passed to allow the connections to be beyond any initial settling period. It may also be of interest to measure the throughput for the above traffic sources concurrently. Good latency performance does not necessarily lead to good throughput performance, and vice versa.

4.3. Fairness Index

4.3.1. Definition

Given n virtual circuits sharing a system (a single switch or a network of switches) and contending for its resources, the fairness index indicates how far the actual individual allocations are from the ideal allocations. The fairness index can be applied to several metrics, such as throughput and latency. In the simplest case, for a total throughput T, the ideal allocation is T/n. However, other fairness criteria may be used, such as those specified in Appendix I.3 of TM 4.1 [2] for the ABR service category. If the actual measured throughputs of the n virtual circuits are found to be {T1, T2, ..., Tn}, where the ideal throughputs should be {T1*, T2*, ..., Tn*}, then the throughput fairness of the system under test is quantified by the "fairness index" computed as follows:

    Fairness index = (x1 + x2 + ... + xn)^2 / (n * (x1^2 + x2^2 + ... + xn^2))

where xi = Ti/Ti* is the relative allocation to the ith VC.

There are two throughput fairness metrics that are of interest to users:

* Peak throughput fairness: the fairness at a frame load for the peak throughput.
* Full-load throughput fairness: the fairness at a frame load for the full-load throughput.

In the case of latency, {T1, T2, ..., Tn} represents the actual measured latencies of the n virtual circuits, while {T1*, T2*, ..., Tn*} should be the ideal latencies (FILO0). The latency fairness of the system under test is quantified by the fairness index defined above. However, extreme unfairness in latency is expected to usually show up as unfairness in throughput, and vice versa. Therefore, it is not required to quantify fairness of latency.

4.3.2. Units

The fairness index is dimensionless. The units of the measurements made to calculate the fairness index (bits/sec, cells/sec, or frames/sec) do not affect its value. In addition, the fairness index has the following desirable properties:

* It is a normalized measure that ranges between zero and one. The maximum fairness is 100% and the minimum 0%. This makes it intuitive to interpret and present.
* If all xi's are equal, the allocation is fair and the fairness index is one.
* If n-k of the n xi's are zero, while the remaining k xi's are equal and non-zero, the fairness index is k/n. Thus, a system which allocates all its capacity to 80% of its VCs has a fairness index of 0.8, and so on.
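The formula and its properties are easy to verify numerically. The following Python sketch is illustrative only; the measured and ideal throughput values are hypothetical numbers chosen to exercise the properties listed above.

    def fairness_index(measured, ideal):
        # Fairness index = (sum of x_i)^2 / (n * sum of x_i^2),
        # where x_i = T_i / T_i* is the relative allocation to the ith VC.
        x = [t / i for t, i in zip(measured, ideal)]
        n = len(x)
        return sum(x) ** 2 / (n * sum(v * v for v in x))

    # Four VCs sharing a total throughput of 100 units; ideal share is 25 each.
    print(fairness_index([30.0, 30.0, 30.0, 10.0], [25.0] * 4))  # ~0.89: mildly unfair
    print(fairness_index([25.0] * 4, [25.0] * 4))                # 1.0: perfectly fair
    print(fairness_index([50.0, 50.0, 0.0, 0.0], [25.0] * 4))    # 0.5 = k/n with k=2, n=4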
4.3.3. Measurement Procedures

The following are examples of applying the fairness index to the throughput metric. To measure peak throughput fairness, the peak throughput for the given SUT must first be obtained, as described in Section 4.1.3. An experiment for peak throughput fairness is performed by generating the input load corresponding to the peak throughput for a time t and recording the throughput for each foreground virtual circuit. The experiment is repeated p times. Here t is expressed in seconds, and p is a parameter whose default value is 30. Similarly, to measure a full-load throughput fairness index, the full-load throughput for the given SUT must first be obtained, as described in Section 4.1.3.

4.4. Frame Loss Ratio

4.4.1. Definition

The Frame Loss Ratio is defined as the fraction of frames that are corrupted or not forwarded by a System Under Test (SUT) for various reasons, such as lack of resources. Following the approach adopted in the definition of other metrics in this specification, two frame transfer outcomes are first proposed, and are then used to define a Frame Loss Ratio (FLR). This Frame Loss Ratio is proposed as a performance parameter for characterizing the information loss performance of frame transfer outcomes.

Using the definitions for frame-based Reference Events, two frame transfer outcomes are defined:

1. A Successful Frame Transfer Outcome is defined as the occurrence at MP2 of a FRE2 corresponding to the occurrence at MP1 of a FRE1, within a time Tmax to be specified, where each user information bit in the FRE2 is identical to the corresponding user information bit in the corresponding FRE1.

2. A Corrupted Frame Transfer Outcome is defined as either (a) the lack of occurrence at MP2, within a time Tmax to be specified, of a FRE2 corresponding to the occurrence at MP1 of a FRE1 and matching it bit-for-bit in user information, or (b) the occurrence at MP2 of a FRE2 for which there is no corresponding FRE1.

Use of a limit, Tmax, on the maximum permissible frame transfer time classifies unduly delayed frames as corrupted frame transfer outcomes. The possible occurrence of a "lost" frame is included in the above definition as a corrupted frame transfer outcome. The possible occurrence of a "misinserted" frame is likewise included as a corrupted frame transfer outcome. This definition of a corrupted frame transfer outcome does not distinguish whether the failure of a FRE2 to match the user information in the corresponding FRE1 bit-for-bit is due to the supporting ATM connection experiencing a lost cell, an errored cell, or a misinserted cell. Regardless of the underlying impairment cause at the supporting ATM layer, the resulting frame is unlikely to be usable by most higher layer applications and therefore contributes to information loss for this frame transfer outcome.

The Frame Loss Ratio is defined as:

    Frame Loss Ratio = Corrupted Frame Transfer Outcomes / (Successful Frame Transfer Outcomes + Corrupted Frame Transfer Outcomes)

where the Successful Frame Transfer Outcomes and Corrupted Frame Transfer Outcomes belong to some population of interest. This population of interest could, for example, contain all such frame transfer outcomes that occur between a specific pair of MPs during a stipulated time interval.

There are two frame loss ratio metrics that are of interest to a user:

* Peak throughput frame loss ratio: the frame loss ratio at a frame load for the peak throughput.
* Full-load throughput frame loss ratio: the frame loss ratio at a frame load for the full-load throughput.

The Frame Loss Ratio for point-to-multipoint, multipoint-to-point and multipoint-to-multipoint connections is for further study.
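The outcome classification above can be expressed directly in code. The following Python sketch is illustrative only; it assumes each test frame carries a sequence number so that FRE2 events can be matched to their corresponding FRE1 events, and that timestamps and Tmax are expressed in the same units.

    def frame_loss_ratio(sent, received, t_max):
        # Classify frame transfer outcomes per Section 4.4.1.
        #   sent:     {seq: (t1, payload)} observed at MP1 (FRE1 events)
        #   received: iterable of (seq, t2, payload) observed at MP2 (FRE2 events)
        successful = corrupted = 0
        matched = set()
        for seq, t2, payload in received:
            ref = sent.get(seq)
            if ref is None:
                corrupted += 1           # "misinserted": a FRE2 with no FRE1
                continue
            t1, ref_payload = ref
            matched.add(seq)
            if payload == ref_payload and (t2 - t1) <= t_max:
                successful += 1          # bit-exact and within Tmax
            else:
                corrupted += 1           # errored or unduly delayed
        corrupted += len(sent) - len(matched)   # "lost": a FRE1 with no FRE2
        total = successful + corrupted
        return corrupted / total if total else 0.0

With the same bookkeeping, the Application Goodput introduced next is simply one minus the returned ratio.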
An alternative metric, called Application Goodput (AG), can be defined as: Application Goodput = Frames Received / Frames Transmitted This metric captures the notion of what an application sees as useful data transmission in the long term. At the AAL level, AG can be calculated as follows: AG = 1 - FLR 4.4.2. Units The frame loss ratio is expressed as a fraction of input frames. 4.4.3. Measurement Procedures The frame loss ratio metric is related to the throughput: Frame Loss Ratio = (Input Rate - Throughput)/Input Rate Thus, no additional experiments are required for frame loss ratios; these can be derived from the tests performed for throughput measurements. 4.5. Maximum Frame Burst Size (MFBS) 4.5.1. Definition Maximum Frame Burst Size (MFBS) is the maximum number of frames that each of the source end systems can send at the peak rate through a system under test without incurring any loss. MFBS measures the data buffering capability of the SUT and its ability to handle back-to-back frames. Many applications and transport layer protocol drivers often present a burst of frames to the AAL for transmission. For such applications, Maximum Frame Burst Size provides a useful indication. This metric is particularly relevant to the UBR service category, since UBR sources are always allowed to send a burst at the peak rate. ABR sources may be throttled down to a lower rate if a switch runs out of buffering resources. 4.5.2. Units MFBS should be expressed in octets of the AAL payload field. This is preferred over a number of frames or cells: the former requires specifying the frame size, and the latter is not very meaningful for a frame-level metric. Also, a number of cells has to be converted to octets for use by AAL users. 4.5.3. Measurement Procedures The MFBS is measured for the k-to-1 connection configuration as shown in Figure 3.6. Thus, k VCCs (or VPCs) are established through the SUT. The measurement procedure may require a number of tests. Each test includes the simultaneous generation of fixed-length bursts of back-to-back cells through all k VCCs (or VPCs) and the counting of all cells transmitted by the SUT. If there is no loss of cells, the length of the bursts is increased, but if there is a loss, the length of the bursts is decreased. In both cases, the next test is performed with the new burst length. The procedure is finished when the maximum cell burst size (MCBS) is found. MCBS is the maximum burst length for which there is no cell loss. Tests are conducted without any background traffic. Given MCBS, one can calculate the maximum integral number of back-to-back frames of a given size which can be sent into the SUT of the given connection configuration and delivered by the SUT without any loss. This integral number is then converted to octets of the AAL payload field to obtain the Maximum Frame Burst Size (MFBS). There is no need to obtain more than one sample for MFBS. Consequently, there is no need to calculate means and/or variances.
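The burst-length search and the MCBS-to-MFBS conversion described above can be sketched as follows (Python). The function send_burst_ok is a hypothetical hook that runs one test and reports whether all cells were delivered without loss, and the conversion ignores AAL-5 trailer and padding octets for simplicity.

def find_mcbs(send_burst_ok, max_cells):
    # Binary search for the maximum cell burst size with no cell loss.
    lo, hi = 0, max_cells
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if send_burst_ok(mid):   # no loss: increase the burst length
            lo = mid
        else:                    # loss: decrease the burst length
            hi = mid - 1
    return lo                    # MCBS: largest burst length with no cell loss

def mfbs_octets(mcbs_cells, frame_size_octets, payload_per_cell=48):
    # Convert MCBS to octets of AAL payload for back-to-back frames of a given size.
    cells_per_frame = -(-frame_size_octets // payload_per_cell)  # ceiling division
    whole_frames = mcbs_cells // cells_per_frame                 # maximum integral number of frames
    return whole_frames * frame_size_octets

For example, under these simplifying assumptions a 1,536-octet frame occupies 32 cells of 48 payload octets each, so an MCBS of 100 cells yields 3 whole frames, i.e., an MFBS of 4,608 octets.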
5. References [1] af-test-0022.000 (Introduction to ATM Forum Test Specifications 1.0), 1994 [2] af-tm-0121.000 (Traffic Management Specification 4.1), 1999 [3] ITU-T Recommendation I.150 (1995), B-ISDN Asynchronous Transfer Mode Functional Characteristics [4] ITU-T Recommendation I.350 (1993), General Aspects of Quality of Service and Network Performance in Digital Networks, Including ISDNs [5] ITU-T Recommendation I.353 (1996), Reference events for defining ISDN and B-ISDN performance parameters [6] ITU-T Recommendation I.356 (1996), B-ISDN ATM layer cell transfer performance [7] B. Efron and R. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, 1993 [8] ITU-T Recommendation I.361 (1995), B-ISDN ATM layer specification [9] ITU-T Recommendation O.191 (1997), Equipment to assess ATM layer cell transfer performance [10] IETF RFC 1242 (1991), Benchmarking Terminology for Network Interconnection Devices [11] ITU-T Recommendation X.135 (1992), Speed of service (delay and throughput) performance values for public data networks when providing international packet-switched services [12] ITU-T Recommendation I.610 (1995), B-ISDN operation and maintenance principles and functions [13] ITU-T Recommendation X.144 (1995), User information transfer performance parameters for data networks providing international frame relay PVC service [14] ITU-T Draft Recommendation X.144 Amendment 1, Annex C (1996), Some Relations Between Frame-level and ATM-level Performance Parameters, published in COM 7-13, 1997 [15] ITU-T Recommendation X.145 (1996), Performance for data networks providing international frame relay SVC service [16] S. Lavenberg (ed.), Computer Performance Modeling Handbook, Academic Press, 1983 [17] IETF RFC 1242 (1991), Benchmarking Terminology for Network Interconnection Devices Appendix A: Defining Frame Latency on ATM Networks A.1. Introduction This appendix discusses the delays, and the performance metrics characterizing them, that an ATM network introduces to its frames. We are concerned with delays caused by node processing (such as switching and routing), queuing delays that may be introduced by background traffic, and inter-network link transmission delays. On the other hand, transmission delays introduced by the input and output links of a network component should not be attributed to the component. Also, note that the characteristics of traffic generators (e.g., host speeds) should not affect network performance metrics. The discussion in this Appendix applies to any network element (including switches, multiplexers, inverse-multiplexers, wires) or any combination of such network elements. Although we frequently use the term "switch," the discussion applies equally well to other network elements, whole networks, or parts of networks. In the case of a single bit, the switch (network) delay is generally defined as the time between the instant the bit enters the system and the instant the bit exits the system. Figure A.1 illustrates the single-bit latency.
Figure A.1: Latency for a Single Bit For multi-bit frames, the usual way to define the frame latency introduced by a switching device is to apply one of the following four definitions: * FIFO latency: Time between the first-bit entry and the first-bit exit * LILO latency: Time between the last-bit entry and the last-bit exit * FILO latency: Time between the first-bit entry and the last-bit exit * LIFO latency: Time between the last-bit entry and the first-bit exit Figure A.2 illustrates the usual frame latencies (FIFO, LILO, FILO and LIFO) in a scenario with a contiguous frame on both input and output, passing through a communication network whose input link rate is lower than its output link rate. Figure A.2: Usual Frame Latencies Unfortunately, as will be shown later, none of the four metrics above is appropriate from an equipment perspective of frame latency. In this appendix, we introduce and justify a new latency metric called "MIMO" latency. This new latency metric applies to any type of network where the frames may be contiguous or discontinuous, although our primary interest is an ATM environment. To define the MIMO latency, we introduce the concept of a "zero-delay" switch, which is in some sense the best a switch can do. The delay of any other switch is defined as the latency over and above the delay of a zero-delay switch. This appendix is organized as follows. In the next section, we discuss the applicability of various latency metrics to the ATM environment. We introduce the MIMO latency in Appendix A.3. In Appendix A.4, we introduce the concept of a zero-delay switch and its processing of individual cells and contiguous frames. We discuss delays introduced to discontinuous frames passing through a zero-delay switch in Appendix A.5. Appendix A.6 presents the method for calculating the FILO latency of frames passing through a zero-delay switch. An equivalent, but easier to use, definition of MIMO latency is developed in Appendix A.7. Appendix A.8 gives an alternative formulation of MIMO latency, and Appendix A.9 extends it to a path of network components. Appendix A.10 presents derivations of expressions for MIMO latency calculation based on cell-level data. Appendix A.11 discusses the user perceived delay in data communication networks. Appendix A.12 references other delay metrics. A.2. Usual Frame Latencies as Metrics for ATM Switch Delay An ATM switch has to deal with both contiguous and discontinuous frames. This is because ATM switches do cell-switching, i.e., an ATM switch may transmit a received cell of any frame without first waiting for other cells of that frame to arrive. Thus, frames sent and received in an ATM environment are not always contiguous. Even if the input frame is contiguous, the ATM switch may transmit discontinuous frames, i.e., it may introduce idle periods, unassigned cells and/or cells of other frames between cells of the frame. The above factors make the usual frame latency metrics inappropriate for ATM switches. In this section, we show why LIFO, FIFO and FILO latencies are not appropriate metrics from an equipment perspective. Later in this appendix, we shall show that FILO latency is an appropriate metric for user perceived performance. A.2.1. LIFO Latency In [11], the delay in a packet-switching network is defined as the time between a "packet entry event" and a "packet exit event." A packet entry event is defined to occur at the time when the last bit of the frame enters a network, while a packet exit event is defined to occur when the first bit of the frame exits a network.
This is equivalent to LIFO latency, which is considered an appropriate metric for store-and-forward packet-switching networks because: * packets (frames) are contiguous on both input and output, and * it is accepted that the transmission delay during packet input is an intrinsic delay for a store-and-forward device, for which the switch should not be penalized. Newer networking devices are not necessarily store-and-forward. Some of them are cut-through devices that start emitting the frame before it is received completely. Figure A.3 illustrates the case of a frame passing through a cut-through switching device with three of the four usual latencies indicated. LIFO latency is not shown because the first bit of the frame exits before the last bit of the frame enters, and the LIFO latency is negative. This is a common case with cut-through devices. Thus, LIFO latency is not a good indicator of the switch delay for any cut-through type device, and as such it is inappropriate for an ATM environment, where cut-through forwarding of frames is the normal mode of operation. Figure A.3: Latencies of a Frame passing through a Cut-Through Switching Device A.2.2. FIFO Latency It is interesting to note that [17] provides a LIFO latency definition as the delay metric for store-and-forward switching devices, as well as a FIFO latency definition for bit forwarding devices (i.e., cut-through switching devices). The introduction of FIFO latency as a delay metric is an attempt to avoid negative values for the delay through cut-through devices. While FIFO latency may provide meaningful results if the frames are contiguous, it may provide useless results if the frames are discontinuous. It is possible to have a very low FIFO delay while the delays for the other parts of the frame are high. Again, since frames on ATM networks are generally discontinuous, FIFO latency is not a meaningful measure of frame latency. Figure A.4 illustrates this point. Figure A.4: Usual Latencies in an ATM Environment In this case, the frame consists of 3 cells passing through an ATM switch whose input link rate is higher than its output link rate. The frame is discontinuous on both input and output. The last cell is delayed considerably more than FIFO latency would indicate. It is possible to have one pattern of idle periods or unassigned cells (positions and a number of them) on the input of a given frame, and a completely different pattern on the output of the same frame. Note that it is also possible for a switch to remove idle periods or unassigned cells from the input, "transmitting" fewer of them on output, as we shall illustrate later. In Figure A.4, as well as in the rest of this appendix, an unassigned cell, an idle period or a cell of another frame between cells of a given frame is indicated as a gap. In Figure A.4 the frame on input has a one-cell gap after the first cell of the frame, followed by the two remaining cells of the frame. On output, there is a two-cell gap after the first cell and then a one-cell gap between the second and the third cell of the frame. From Figure A.4, it can be observed that it is possible for a switch to have a small FIFO latency if the first cell of a frame is transmitted quickly. However, if the later cells are delayed considerably, the receiver is not able to assemble the frame. FIFO latency does not reflect the expansion and compression of gaps on output. This is why FIFO latency is not an appropriate delay metric for switches in the ATM environment. A.2.3.
FILO Latency From any of the previous three figures, it can be noted that the relationship between FILO and LILO latency is as follows: FILO latency = LILO latency + Frame Input Time FILO latency is different for different frame input patterns. The suitability of the LILO and FILO metrics under various circumstances is discussed after introducing MIMO latency in the next section. A.3. MIMO Latency Definition MIMO latency (Message-In Message-Out) is a performance metric that defines the delay introduced as a frame passes through a switch (or any other network component). When applied to a single switch, the MIMO latency accounts only for delays introduced by the switch (because of switching and other processing) and is independent of the frame input time, output transmission time, and other physical layer delays introduced on the input and output links. Succinctly, MIMO latency is defined as follows: MIMO latency = FILO latency - FILO0 where * FILO0 is equal to the FILO latency of a given frame passing through a zero-delay switch. We define a zero-delay switch as a switch that handles incoming frames in such a way that they are transmitted on the output link without any time-consuming processing. The above definition implies that MIMO latency is the difference between the measured FILO latency of a frame passing through the given switch and the FILO latency of the same frame passing through a zero-delay switch. As defined, MIMO latency has the desired property of always being positive (or zero for a zero-delay switch). The MIMO latency is not limited to switches. It applies to all types of communication devices, including repeaters, multiplexers, (store-and-forward or cut-through) bridges, routers, ATM switches, wires, or any combination of these. MIMO latency also accounts for discontinuous frames on the input and/or output. For discontinuous frames on input, gaps may include idle periods, unassigned cells and/or cells from other frames. For discontinuous frames on output, it is assumed that there are no cells from other frames inserted between the cells of the given frame, but idle periods or unassigned cells are allowed. It should be realized that this last assumption does not present a limitation for measurements in benchmarking environments. In the following two sections, we explore the concept of a zero-delay switch in depth. A.4. Cell and Contiguous Frame Latency Through a Zero-Delay Switch Figure A.5 illustrates the latency that a one-bit frame would experience while passing through a zero-delay switch. As expected, a zero-delay switch should start transmission on the output link as soon as the bit arrives on the input link. Thus, the latency of a single bit through a zero-delay switch is equal to zero. A wire of zero length is one example of a zero-delay switch. Figure A.5: Latency of One Bit passing Through the Zero-Delay Switch Figure A.6 illustrates how a zero-delay switch would handle a cell consisting of multiple bits. The desired performance depends upon the relationship between the input and output link rates. In the case when the input link rate is equal to the output link rate, as presented in Figure A.6a, a zero-delay switch transmits each bit as soon as it arrives. Thus, each bit of the cell experiences zero latency in a zero-delay switch. Figure A.6b illustrates the case when the input link rate is higher than the output link rate. In this case, outputting (transmitting) a bit takes longer than inputting it.
The zero-delay switch can transmit only the first bit as soon as it is received. The other bits of the cell cannot be transmitted immediately as they arrive, because the transmission of all previously received bits has not yet finished. Bits at the end of the cell wait longer than bits at the beginning. Thus, a zero-delay switch in this situation should be intelligent enough to do appropriate buffering of incoming bits. A zero-length wire with a FIFO buffer is an example of a zero-delay device that can handle an input faster than the output. Figure A.6c illustrates the case when the input link rate is lower than the output link rate. A zero-delay switch does not start transmission of the first bit immediately after it is received, but after an appropriate delay. Bits at the beginning of the cell are delayed more than bits at the end, with larger delays for slower output link rates. Only the last bit of a cell has no delay, and it is transmitted immediately upon its arrival. Thus, a zero-delay switch would be intelligent enough to avoid under-runs by appropriately delaying the transmission of incoming bits. A zero-length wire with an "intelligent" FIFO buffer is an example of such a zero-delay device. It should be realized that the illustrations in Figure A.6 apply not only to cells, but also to contiguous frames passing through a zero-delay switch. Note that a repeater can be considered as a zero-delay switch with input link rate equal to output link rate. Thus, Figure A.6a illustrates how a repeater handles incoming frames. Also, note that a multiplexer, with n links on input and the output link capacity equal to the sum of the input link capacities, can be considered as a zero-delay switch with input link rate lower than output link rate. For a multiplexer with two input links of rates equal to one half of the output link rate, Figure A.6c illustrates how the multiplexer would handle incoming frames. Similarly, a demultiplexer can be considered as a zero-delay switch with an input-link rate higher than the output-link rate. Figure A.6b illustrates the operation of a two-output demultiplexer. Based on Figure A.6, Table A.1 provides (qualitative) indications for the four usual frame latency metrics applied to a zero-delay switch. None of the latencies is zero in all three cases, as the latency of a frame passing through a zero-delay switch should be.

Table A.1: Usual Latencies Applied to a Zero-Delay Switch

                           FIFO       LILO       LIFO       FILO
Input rate = Output rate   0          0          negative   positive
Input rate > Output rate   0          positive   negative   positive
Input rate < Output rate   positive   0          negative   positive

A.5. Latency of Discontinuous Frames Passing Through a Zero-Delay Switch In this section, we consider how a zero-delay switch handles discontinuous frames in an ATM environment. In particular, we are interested in FILO latency, since it is used in the MIMO latency definition. Figure A.7 illustrates one of two possible cases of a frame passing through a zero-delay switch with an input link rate higher than the output link rate. The frame includes two cells, and the input link rate is 4 times the output link rate. The two cells start arriving at time t = 0 and t = 5, respectively. A zero-delay switch will start transmitting the first cell at time t = 0 and finish at time t = 4. The second cell can be transmitted without waiting and is finished at t = 9. This is how long a zero-delay switch will take to transmit this frame. Hence, the FILO latency of a zero-delay switch for this frame is 9.
This is the normalized frame output time (FILO0) for this input pattern. No device can transmit this frame any faster. If a device takes longer, the difference between the FILO latency of the device and FILO0 is considered the delay introduced by the device. Figure A.7: Zero-Delay Switch Operations, no Cell Waiting Case (Input Rate > Output Rate) Figure A.8 shows the other possible case of a frame passing through a zero-delay switch with an input link rate higher than the output link rate. As in Figure A.7, the frame has two cells and the input link rate is 4 times the output link rate. However, the frame has a different gap pattern. The second cell arrives at time t = 2 and thus has to wait. A zero-delay switch will start transmitting the first cell at time t = 0 and finish at time t = 4. The second cell can be transmitted at t = 4 and finishes at t = 8. Hence, the FILO latency of a zero-delay switch for this frame is 8. Thus, in the case when the input link rate is higher than the output link rate, it is possible that: * an incoming cell can be transmitted immediately (no cell waiting case), or * an incoming cell has to wait for previously received cells of the same frame to be transmitted (cell waiting case). Thus, for a given discontinuous frame, it is possible that some cells have to wait for previously received cells of the same frame, while other cells can be transmitted without waiting. Also, notice that a zero-delay switch decreases the size of each gap from the input, with some gaps being completely removed. Figure A.9 illustrates the only possible case of a frame passing through a zero-delay switch with an input rate lower than the output rate. Again, the frame includes two cells, but the output link rate is now four times the input link rate. The two cells arrive at time t = 0 and t = 5, respectively. A zero-delay switch will start transmitting the first cell at time t = 3 (not at t = 0, in order to avoid an underrun), and finish at time t = 4. The second cell starts at t = 8 and finishes at t = 9. This is how long a zero-delay switch will take to transmit this frame. Hence, the FILO latency of a zero-delay switch for this frame is 9. Note that in the case when the input rate is lower than the output rate, a cell never has to wait for the completion of transmission of previously received cells. Also, notice that in this case, a zero-delay switch does not eliminate any gaps from the input, although each gap is enlarged on output. Additionally, when back-to-back cells are received on the input, new gaps are introduced between cells on the output. A.6. Calculation of FILO Latency for a Zero-Delay Switch The MIMO definition introduces FILO0 as the FILO latency of a frame passing through a zero-delay switch. In this section, we explain how to obtain FILO0 "on the fly," i.e., when a frame pattern is not known in advance, but cell arrival times can be obtained in real time. We define the following parameters: * CIT = cell input time = 424 [bits] / Input Link Rate [bits/sec] * COT = cell output time = 424 [bits] / Output Link Rate [bits/sec] The procedure for FILO0 calculation is as follows: a. Initially FILO0 = 0, and time t is measured from the arrival of the first bit of the first cell in a zero-delay switch. b. For each cell with its first bit arriving at time t, update FILO0 as follows: FILO0 = max{t, FILO0} + COT, if CIT ≤ COT FILO0 = max{t + CIT − COT, FILO0} + COT, if CIT > COT where CIT and COT are as defined above. (The second branch ensures that the last bit of a cell is never output before it has been received; this update rule reproduces the zero-delay FILO latencies of 9, 8 and 9 obtained for the frames of Figures A.7, A.8 and A.9, respectively.)
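A minimal sketch of this on-the-fly calculation (Python; times and rates in arbitrary consistent units) follows; the example calls reproduce the zero-delay FILO latencies of Figures A.7, A.8 and A.9.

def filo0(first_bit_arrivals, cit, cot):
    # FILO latency of a frame through a zero-delay switch, computed on the fly.
    # first_bit_arrivals: arrival time of the first bit of each cell of the frame.
    # cit, cot: cell input/output times (424 bits divided by the link rate).
    f = 0.0
    for t in first_bit_arrivals:
        if cit <= cot:                      # input link at least as fast as output
            f = max(t, f) + cot
        else:                               # input link slower than output
            f = max(t + cit - cot, f) + cot
    return f

print(filo0([0, 5], cit=1, cot=4))  # 9, as in Figure A.7 (no cell waiting case)
print(filo0([0, 2], cit=1, cot=4))  # 8, as in Figure A.8 (cell waiting case)
print(filo0([0, 5], cit=4, cot=1))  # 9, as in Figure A.9 (input rate < output rate)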
A.7. Equivalent MIMO Latency Definition An equivalent MIMO latency definition, which is more convenient for use in frame latency measurements and calculations when the input link rate is lower than or equal to the output link rate, can be derived as follows. An input link rate lower than or equal to the output link rate implies that CIT ≥ COT. In this case, a zero-delay switch will transmit the last bit of each cell of the frame as soon as it is received. In particular, the last bit of the frame is transmitted as soon as it is received. Thus, FILO0 in these cases is equal to the frame input time: FILO0 = Frame Input Time and MIMO latency = FILO latency - FILO0 = FILO latency - Frame Input Time = LILO latency Then the equivalent MIMO latency definition is: MIMO latency = LILO latency, when the input link rate is lower than or equal to the output link rate. Throughout this discussion, we assume that the link rates are used in the latency computation. If other rates are used, there is the potential for strange results. For example, it is possible that a carrier may offer a lower rate contract to a customer on a higher rate link. If the peak cell rate for the traffic contract is less than the link rate, and this peak cell rate is used for MIMO calculations, then the MIMO value may be negative, depending on the scheduling of cells on the link and the traffic contract. Using the link rate in MIMO calculations avoids this potential problem. A.8. An Alternative Definition of MIMO The MIMO latency of a switch is defined as: MIMO = FILO - FILO0 where FILO is the measured first-bit-in to last-bit-out latency and FILO0 is the FILO latency of an ideal switch for the same input pattern. Note that FILO is the sum of the frame input time (first bit in to last bit in) and the LILO (last bit in to last bit out) latency (see Figure A.10): FILO = Frame Input Time + LILO and FILO0 = Frame Input Time + LILO0 Since the frame input time does not depend upon the switch, MIMO can also be expressed as: MIMO = LILO - LILO0 Here, LILO is the measured LILO latency and LILO0 is the LILO latency of an ideal switch for the same input pattern. Given the input and output speeds, LILO0 can be easily computed. Figure A.11 shows the three possible cases. The figure shows that LILO0 is zero unless the input speed is faster than the output speed: LILO0 = 0 if input speed ≤ output speed Figure A.10: Relation between MIMO and LILO a. Input Speed < Output Speed b. Input Speed = Output Speed c. Input Speed > Output Speed Figure A.11: An ideal switch introduces a nonzero LILO latency only when the input link speed is greater than the output link speed. A.9. MIMO Latency of a Path Consider a network path consisting of n components in a series. Subscript i is used for the latency of the ith component and subscript Σ for the combination. Thus, MIMOi = LILOi - LILO0i Similarly, for the network path: MIMOΣ = LILOΣ - LILO0Σ Since LILO is additive: LILOΣ = Σi LILOi The above three relationships lead us to the following identity: MIMOΣ + LILO0Σ = Σi (MIMOi + LILO0i) or MIMOΣ = Σi MIMOi + Σi LILO0i - LILO0Σ This relationship allows us to compute the MIMO of a series of components from the measured MIMO values of the individual components. Note that LILO0i and LILO0Σ can be computed given the input pattern and the input/output speeds. We illustrate this with a few examples. Example 1: Consider the configuration shown in Figure A.12 consisting of two switches interconnected via a wire. All links and ports are 150 Mbps. The input frame is composed of two cells with a gap of 3 cell times. Let us suppose that each switch introduces a MIMO equal to c, where c is the cell time at 150 Mbps.
Also, for simplicity, assume that the wire between the switches also introduces a MIMO of c. (Other wire lengths can be handled similarly.) Since all input/output speeds are the same, an ideal switch will produce zero LILO latency. Hence, LILO0i = 0 LILO0Σ = 0 MIMOΣ = Σi MIMOi = c + c + c = 3c That is, the MIMO latency is simply the sum of the individual MIMO latencies. Figure A.12: MIMO Aggregation for Example 1. Example 2: Consider the configuration shown in Figure A.13. This is similar to the configuration of Example 1, except that the intermediate link is 50 Mbps and therefore introduces a delay of 3c. In this case, the first switch has an input speed of 150 Mbps, while the output speed is 50 Mbps. An ideal switch with these I/O speeds will produce a LILO latency of 2c, where c is the cell time at 150 Mbps. That is, LILO01 = 2c For the wire as well as the second switch, the input speed is equal to or less than the output speed and so the LILO0 is zero: LILO02 = 0 LILO03 = 0 Figure A.13: MIMO Aggregation for Example 2. If the whole network is replaced by a single ideal switch, that switch will have an input speed of 150 Mbps and an output speed of 150 Mbps and, therefore, will have a zero LILO latency. That is, LILO0Σ = 0 Using the above values, we get: MIMOΣ = Σi MIMOi + Σi LILO0i - LILO0Σ = (c + 3c + c) + (2c + 0 + 0) - 0 = 7c A.10. Measuring MIMO Latency To measure the MIMO latency for a frame passing through the System Under Test (SUT), the times of occurrence of the following two events need to be recorded: * the first bit of the frame enters the SUT, * the last bit of the frame exits the SUT. The time between these two events is the FILO latency. FILO0 can be obtained from the cell pattern of the test frame on input, as explained in Appendix A.6. Substituting the FILO latency and FILO0 into the MIMO latency formula gives the SUT's delay for the given frame. If the input link rate is lower than or equal to the output link rate, it is easier to calculate MIMO latency. In this case, the times of occurrence of the following two events need to be recorded: * the last bit of the frame enters the SUT, and * the last bit of the frame exits the SUT. The time between these two events is the LILO latency, which is equal to the MIMO latency for the frame. Note that the cell arrival pattern does not matter in this case. Contemporary ATM monitors provide measurement data at the cell level. Considering that the definition of MIMO latency uses bit-level data, we now describe how to calculate MIMO latency using measurements at the cell level. Standard definitions of the two cell-level performance metrics that are of importance for the MIMO latency calculation are: * cell transfer delay (CTD), defined as the time between the first bit of the cell entering the switch and the last bit of the cell leaving the switch, and * cell inter-arrival time, defined as the time between the arrival of the last bit of the first cell and the arrival of the last bit of the second cell. In cases where the input link rate is higher than the output link rate, according to the MIMO latency definition, the FILO latency has to be measured. From Figure A.14, it can be observed that: FILO latency = First cell's transfer delay + First cell to last cell inter-arrival time Thus, to calculate MIMO latency when the input link rate is higher than or equal to the output link rate, it is necessary to measure the transfer delay of the first cell of a frame and the inter-arrival time between the first cell and the last cell of the frame.
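A minimal sketch of this cell-level calculation for the case just described (Python; the timestamps are hypothetical, in arbitrary consistent units):

# MIMO latency from cell-level measurements, input link rate > output link rate.
# FILO latency = first cell's transfer delay + first-to-last cell inter-arrival
# time (both as defined in A.10); FILO0 comes from the input cell pattern (A.6).

def mimo_from_cell_data(first_cell_ctd, first_to_last_interarrival, filo0_value):
    filo = first_cell_ctd + first_to_last_interarrival
    return filo - filo0_value   # MIMO latency = FILO latency - FILO0

# Hypothetical measurement: first cell CTD of 6, inter-arrival of 5, and a
# FILO0 of 9 computed from the input cell pattern as in Appendix A.6.
print(mimo_from_cell_data(6, 5, 9))  # 2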
In cases when the input link rate is lower than or equal to the output link rate, it is sufficient to measure the LILO latency. From Figure A.15, it can be observed that: LILO latency = Last cell's transfer delay - CIT Thus, to calculate MIMO latency when the input link rate is lower than or equal to the output link rate, it is necessary to measure only the transfer delay of the last cell of a frame. A.11. User Perceived Delay It should be pointed out that MIMO latency measures only the SUT's contribution to the delay. It does not include the delay caused by components not under the SUT's control. In particular, it does not include the frame input time. However, a user using the system does have to wait while the frame is being sent to the SUT. A user typically assembles the frame and gives it to the network. The user starts waiting as soon as the first bit starts entering the system and cannot do any meaningful work until the last bit exits the network. Thus, user perceived performance is reflected by FILO latency. Figure A.14: FILO Latency Calculation (Input Rate > Output Rate) Figure A.15: LILO Latency Calculation (Input Rate ≤ Output Rate) Figure A.16 illustrates the relationships between the user perceived performance and MIMO latency in two scenarios with contiguous frames. In the first scenario, the input link rate is the same as the output link rate. In the second scenario, the output is slower. The switch delay, as given by MIMO latency, is the same in both cases, but the user perceived delay, as given by FILO latency, is different. For the case in Figure A.16b, FILO latency is worse. It can be observed that the user perceived delay depends upon the input/output link speeds. On the other hand, the network delay measured by MIMO latency is independent of link speeds. The difference between these two delays is the frame latency through a zero-delay switch. Figure A.16: FILO Latency as User Perceived Delay A.12. Other Delay Metrics T1A1.3 is considering a Last Cell Delay (LCD) frame latency metric that indicates the latency induced on the last cell by the SUT. LCD and LILO allow additivity and allocation of end-to-end frame latency. Appendix B: Methodology for Implementing Scalable Test Configurations B.1. Introduction Throughout this appendix it is assumed, for improved readability, that the IUT consists of a single switch, although the methodologies presented here apply equally to test cases in which the IUT is a network of switches or, alternatively, a subset of modules of a single switch. The notation Pij is used to refer to the jth port of the ith module of the IUT, and (Pab, Pcd) indicates that a connection (either internal or external to the IUT) exists between Pab and Pcd. In Section 3.3, a number of connection configurations have been presented. In most of the cases, these configurations require one traffic generator and/or analyzer for each switch port. Thus, the number of generators and/or analyzers increases as the number of ports increases. It is desirable to define scalable configurations that can be used with a limited number of generators. However, one problem with scalable configurations is that there are many ways to set up the connections, and measurement results could vary with the setup. For example, in the case of unicast, it may not be possible to overload a port with only one generator. A scalable configuration using two generators may exhibit different behavior, such as overloading, that may not show up with one generator.
Performance testing requires two kinds of virtual channel connections (VCCs): foreground VCCs (whose traffic is measured) and background VCCs (whose traffic simply interferes with the foreground traffic). The methodology for generating configurations of both types of VCCs is covered by this appendix. The VCCs are formed by setting up connections between ports of the switch. The connections are internal through the switch fabric and external through some transmission medium or wires (which could be cables, fibers, or even wireless links), depending on the port technology. In this Appendix, internal connections are shown by thin lines and external connections by thick lines. ATM connections (those established internal to the IUT) are inherently bi-directional. A uni-directional test configuration is one in which the flow of test traffic occurs in only one direction across each ATM connection. Alternatively, a bi-directional test configuration would exercise both directions. Whenever external connections are used in the test configurations, only permanent VCCs can be established. Two generic categories of scalable configuration are presented in this appendix, namely: 1. "Parallel Traffic Replication" configurations, discussed in Appendix B.2, which employ the point-to-multipoint capability of a switching system (other than the IUT) to artificially generate more traffic than is possible with a limited number of traffic generators, and 2. "Serial Traffic Replication" configurations, discussed in Appendix B.3, which employ external connections to serially relay traffic egressing from the IUT back into the IUT, thereby emulating additional traffic generators. B.2. Parallel Traffic Replication The point-to-multipoint capability of a switching system is a method for traffic replication intended primarily for broadcast communication services, but it also lends itself well to the task of generating the traffic inputs to the IUT that are required for the test configurations shown in Figure 3.6. Identical ATM cells are broadcast in parallel from multiple output ports - hence 'parallel traffic replication'. Given a single traffic generator, and a switching system (other than the IUT) with a point-to-multipoint capability (a multicast switch), an IUT may receive traffic on as many input ports as the multicast switch has available output ports. This form of scalable configuration is depicted in Figure B.1, where G is the single traffic generator. Internal to the IUT, any of the configurations from Figure 3.6 may be used. Note that if the multicast switch does not support multipoint-to-point connections, then this form of parallel traffic replication cannot support bi-directional test configurations. In such cases, it may be necessary to use serial traffic replication, as described in the next section. Also, the measured results may be affected by the performance of the multicast switch. Figure B.1: A Parallel Traffic Replication Scalable Configuration B.3. Serial Traffic Replication The serial traffic replication method for generating large bandwidths of test traffic requires a minimal number of test ports. However, serial traffic replication may in some cases lead to misleading results about the performance of the SUT. There is no guarantee that an SUT that performs poorly with serial traffic replication will also perform poorly with parallel traffic replication or under real-world conditions.
Interpretation of results from traffic replication configurations requires a detailed understanding of the SUT, the input test sources, and proper analysis techniques. For instance, the throughput of an SUT with serial traffic replication may be less than the throughput of the SUT with parallel traffic replication or under real-world conditions. In such cases, the use of serial traffic replication may nullify any gains achieved by statistical multiplexing. Therefore, when using this method, caution must be taken when interpreting results for the performance of the SUT. An example test configuration employing serial traffic replication is provided in Figure B.2, which shows a 4-port switch with ports labeled P11, P12, P21 and P22. Of these, ports P21 and P12 are connected by a wire W1, while port P22 has a "loopback" wire LB that connects the output of the port to its input. Internally, PVCs have been set up to connect port P11 with P21 and port P12 with P22. Note that all external connections (wires) and internal connections (PVCs) in this case are bi-directional, except the loopback. During testing with this configuration, cells first enter the switch at P11 and are passed through every port of the switch in series, before looping back at P22 and following the reverse path to exit the switch for the last time at P11. The methodology presented here has two phases. During the first phase, the switch ports are connected externally by numbered wires, as in Appendix B.3.1. The second phase consists of setting up PVCs, i.e., internal connections between ports, as explained in Appendix B.3.2. The sequence of concatenated connections (internal and external) is called a VCC Chain. For example, the VCC shown in Figure B.2.b is formed by setting up a VCC chain starting from P11-In. ATM cells flow internally from P11-In to P21-Out, externally via wire W1 to P12-In, internally to P22-Out, externally via wire LB to P22-In, internally to P12-Out, externally through wire W1 to P21-In, finally exiting at P11-Out. This VCC chain can be indicated as: generator-P11-P21-P12-P22-P22-P12-P21-P11-analyzer. Of these connections, P22-P22 is a unidirectional external connection (loopback, denoted as LB) and P12-P21 is a bi-directional external connection (wire, denoted as W). The sequence of external connections used in this VCC chain is: Generator-W1-LB-W1-Analyzer. Both of the above notations are symmetric in the sense that the second half of the chain is a mirror image of the first half. For example, W1-LB is the mirror image of LB-W1. Figure B.2: A VCC Chain that can Implement the 4-to-4 Straight Configuration. Another possible configuration for this "n-to-n single generator scalable configuration" is P11-P12-P21-P22-P22-P21-P12-P11. The various VCC chains may be distinguished by the order of, and the direction through which, each wire is initially traversed by the generated traffic. The four-port switch shown in Figure B.2 consists of two modules with two ports each. The measured performance for a given test configuration may depend upon whether the internal connections of the VCC chain are inter-module, intra-module, or a mixture of both. The methodology presented in this appendix ensures that it is possible to carry exclusively inter-module or exclusively intra-module traffic. The trivial case occurs when the IUT consists of a single module. In such cases, all ATM connections are intra-module.
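The chain notation above can be modelled directly. The following sketch (Python, using the port and wire names of Figure B.2) builds the example VCC chain and checks the mirror-symmetry property just described:

# VCC chain of Figure B.2: ports visited from generator to analyzer.
chain = ["P11", "P21", "P12", "P22", "P22", "P12", "P21", "P11"]

# External connections traversed between successive internal hops:
# (P21, P12) and (P12, P21) use wire W1; (P22, P22) is the loopback LB.
externals = ["W1", "LB", "W1"]

# Both notations are symmetric: the second half mirrors the first half.
assert chain == chain[::-1]
assert externals == externals[::-1]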
B.3.1. Implementation of External Connections The methodology for implementing the external connections consists of the following three steps: 1. Identify the modules to be included in the IUT and label the ports (using the Pij format). 2. Connect the generators and analyzers to the appropriate ports. 3. Establish and number external connections (wires) to use all the remaining ports of the IUT. These steps are now explained. Step 1. Identifying the Modules to be Included in the IUT In order to ensure that it is possible for the configuration to support exclusively inter-module and/or intra-module internal connections, the IUT should consist of pairs of similar modules. If this constraint is not satisfied, the VCCs that are established may be a mixture of inter-module and intra-module connections. It is not necessary that the modules/ports be labeled, although we use the Pij format here to assist in the description of the methodology. Consider a switch with several modules of different port types. The ports could differ in speed and/or connector type. Each module may have a different number of ports. For example, a switch may have two modules of eight and six 155-Mbps single-mode fiber ports, respectively, another module with eight 155-Mbps UTP ports and a fourth module with six 25-Mbps UTP ports. Figure B.3 shows an example IUT where the modules are grouped by type. The first group consists of two 25-Mbps UTP modules; the second group consists of two 155-Mbps single-mode fiber modules. External connections may only be established between ports that are co-located within the same group (hence the constraint that modules come in pairs for inter-module connectivity). Figure B.3: Example Partitioning of Modules into Groups. Step 2. Connect the Generators and Analyzers to the Appropriate Ports A port must be reserved for each generator/analyzer that is to be used in the test. These reserved ports cannot be used in the next step, which establishes the external connections. The methodology presented here allows any given number, r, of generators and analyzers to be used.

For example, ρ > 0.8 might require N ≥ 250. A large positive correlation means that a data point above the mean is likely to be followed by another data point above the mean, so one might see a disproportionately large number of "elevated" sample points in a small, correlated sample. This will bias the sample variance and the asymptotic confidence interval bounds if N is too small. One can use either method (jackknife or asymptotic) for large samples, as both methods produce equivalent results. The dataset used for the ρ = 0.5, N = 30 example is: {1.14323, 0.77446, -0.89094, -0.09006, -0.27767, -0.22702, 0.89675, 0.05671, 1.92046, 2.40715, 1.70900, 0.53860, -0.32551, -0.62455, 0.33333, 2.46235, 2.28505, 0.49709, -0.51621, 0.96350, -1.17857, -2.23387, -1.95727, -1.89923, -1.66045, -0.05656, 0.60535, 2.00016, 1.35338, 1.01044}. C.3.2. Generalized Jackknife Let X(l,n) be the nth element of the lth subsample, where the N sample points are divided into B subsamples of L consecutive elements each (N = B × L). Let X̄(l) be the sample mean of the lth subsample. It is assumed here that the subsamples are approximately statistically independent. This will be true if, for example, L is large relative to the span over which the autocorrelation is significant, so that Cov(X̄(l), X̄(m)) ≈ 0 for l ≠ m. Let Q(j) denote the mean square of the jth subsample. That is, Q(j) = (1/L) Σn X(j,n)². Let s²(j) (j = 1, ..., B) be the unbiased sample variance of X with the jth subsample removed. Let X̄ denote the sample mean of X and s² denote the unbiased sample variance of X. Define the pseudo-values p(j) = B·s² − (B − 1)·s²(j), for j = 1, ..., B. Let p̄ be the sample mean of the p(j) and s²p be their sample variance. That is, p̄ = (1/B) Σj p(j) and s²p = (1/(B − 1)) Σj (p(j) − p̄)². Given these definitions, (p̄ − σ²)/(sp/√B) has approximately a t-distribution with B − 1 degrees of freedom. Thus, the (1 − α)100% confidence interval for the variance σ² is: p̄ ± t(1 − α/2; B − 1) · sp/√B.
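A minimal sketch of this procedure (Python, applied to the ρ = 0.5, N = 30 dataset above with B = 6 subsamples of L = 5; the choice of B is illustrative only):

import math

data = [1.14323, 0.77446, -0.89094, -0.09006, -0.27767, -0.22702,
        0.89675, 0.05671, 1.92046, 2.40715, 1.70900, 0.53860,
        -0.32551, -0.62455, 0.33333, 2.46235, 2.28505, 0.49709,
        -0.51621, 0.96350, -1.17857, -2.23387, -1.95727, -1.89923,
        -1.66045, -0.05656, 0.60535, 2.00016, 1.35338, 1.01044]

def sample_var(x):
    # Unbiased sample variance.
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / (len(x) - 1)

B, L = 6, 5                      # B subsamples of L consecutive points each
s2 = sample_var(data)
pseudo = []
for j in range(B):
    # Sample variance with the jth subsample removed.
    reduced = data[:j * L] + data[(j + 1) * L:]
    pseudo.append(B * s2 - (B - 1) * sample_var(reduced))

p_bar = sum(pseudo) / B
s2_p = sum((p - p_bar) ** 2 for p in pseudo) / (B - 1)
half_width = 2.571 * math.sqrt(s2_p / B)   # t(0.975; 5) = 2.571 for a 95% interval
print(p_bar - half_width, p_bar + half_width)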
Appendix D: In-service and Out-of-service Measurement of QoS Parameters D.1. Introduction Performance testing deals with the in-service and out-of-service measurement of QoS parameters under different load condition profiles. In Appendix D.2, the out-of-service measurements are described, whereas in Appendix D.3, the in-service measurements are presented. D.2. Out-of-service Measurement of QoS Parameters D.2.1. Introduction If the system under test can be taken out of service, then user cells can be replaced by test cells. These cells can be used, together with a suitable measurement algorithm, to measure the ATM performance parameters of the system under test. D.2.2. Test Cell Format The test cell format adopted by the ATM Forum is that provided in ITU-T Recommendation O.191 [9], shown in Figure D.1. Figure D.1: ITU-T Recommendation O.191 Test Cell The UN field (unused octets) can be used for proprietary purposes, provided this is indicated by the setting of the PPI (Proprietary Payload Indicator) bit in the T field (Test cell payload type). D.2.3. Test Configuration Since QoS parameters are defined as parameters that can be directly observed by users, the following definitions can be given: * System under Test (SUT) is an ATM network or an ATM switch. * Access for QoS measurement is the T or S reference point at the UNI. Test cells, as used in this appendix, can be either the test cell format described in Appendix D.2.2, or those used in the AAL-5 test frames as described in Appendix D.2.5. Figure D.2: QoS Measurement Arrangement The SUT is loaded by the following cell traffic: * Traffic at the Test Cell Input: * Test Traffic (A_Cells, unidirectional, called the test cell stream) generated by the test equipment for the VP and/or VC used for testing purposes * Controlled Traffic (B_Cells, unidirectional) generated by the test equipment but for VPs/VCs not used for testing * Real Traffic (C_Cells, bi-directional) generated/terminated by real subscribers connected to the UNI used for testing * Traffic at the Test Cell Output: * Test Traffic (A'_Cells, unidirectional) terminated at the test equipment for the VP and/or VC used for testing purposes * Controlled Traffic (B'_Cells, unidirectional) terminated at the test equipment but for VPs/VCs not used for testing * Real Traffic (C'_Cells, bi-directional) generated/terminated by real subscribers connected to the UNI used for testing * Background Traffic (D_Cells and D'_Cells, bi-directional): traffic generated by real subscribers not connected at the Test Cell Input or Test Cell Output. In general, two types of test configurations exist: 1. The SUT is fully under the control of the tester. There is no real and no background traffic. The test traffic as well as the controlled traffic is generated by the test equipment. This configuration allows reproducible test results to be obtained, because the QoS parameters depend very much on the overall traffic in the SUT. 2. The SUT is loaded with real as well as background traffic that is out of the tester's control. It is difficult to get reproducible test results, mainly because the real and background traffic could lead to overload conditions within the SUT. Therefore, it is necessary to measure not only the QoS parameters but also, in parallel, the load: * at the UNI of the Test Cell Input; * at the UNI of the Test Cell Output; * within the SUT. D.2.3.1. Test Cell Input The Test Cell Input is a T or S reference point.
Two categories of cells can be generated: A_Cells, which form the test traffic. Each cell contains: - the VPI and VCI of a VP and VC established for testing purposes; - a correct header (HEC); - a cell payload containing an identification (e.g., a cell sequence number, a time stamp); the payload has to be guarded by a CRC-16. The test traffic conforms to the negotiated traffic contract. Note that any non-conformity could introduce cell loss and therefore a significant decrease of the QoS. B_Cells, which form the controlled traffic. Each cell contains: - a VPI and VCI different from the VP and VC used for testing purposes; - a correct header (HEC); - a cell payload. D.2.3.2. Test Cell Output The Test Cell Output is a T or S reference point. Since the QoS parameters are based on cell events, actual measurements can only be performed above the physical layer. Figure D.3: QoS Measurement Access The physical layer implemented in the test equipment may be considered free of processing errors and without processing delay; otherwise, errors and delays in the physical layer should be taken into account in the calculation of the QoS parameter values. The QoS measurement access will receive only valid cells from the physical layer, because cells arriving at the Test Cell Output with a faulty HEC will either have the header corrected or be discarded. Therefore, it is not possible to analyze faulty cells with a HEC error at the ATM layer. At the QoS measurement access, the cell stream is divided into the following two types of cells: 1. A'_Cells belong to the test traffic and are identified by the VPI or VPI/VCI. The following analysis is done: * CRC of the payload: * If the CRC is correct, the cell is assumed to be a valid test cell. In this case it is possible to analyze the cell identification in the payload, which shows whether the cell is in the right sequence: - If yes, the cell is assumed to be a correct test cell (called an a_cell). - If no, the cell is assumed to be a missequenced test cell (called a b_cell). * If the CRC is incorrect, the cell is assumed to be an errored test cell, even though there is a possibility that the cell is a misinserted one (called a c_cell). 2. B'_Cells do not belong to the test traffic but have to be taken into account (together with the real traffic) to calculate the total traffic at the Test Cell Output. D.2.4. Analysis of the Test Cell Stream D.2.4.1. Cell Error Ratio (CER) A continuous test cell stream containing A_cells is sent at the Test Cell Input. At the Test Cell Output, the A'_cells are analyzed during a time interval t1 (service dependent; the value is for further study). CER is calculated as follows: CER = c / A' where: A': number of received A'_cells (test cells) c: number of errored A'_cells (c_cells: payload CRC is faulty) This method will lead to incorrect results if misinserted cells with payload bit errors are received (overcount of c and A'). D.2.4.2. Cell Loss Ratio (CLR) A continuous test cell stream containing A_cells is sent at the Test Cell Input. At the Test Cell Output, the A'_cells are analyzed during a time interval t1 (service dependent; the value is for further study). CLR is calculated as follows: CLR = (A - A') / A where: A: number of sent A_cells (test cells) A': number of received A'_cells (test cells) This method will lead to incorrect results if misinserted cells with payload bit errors are received (overcount of A'). If misinserted cells outnumber lost cells, CLR will be negative.
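As a minimal illustration (Python; the counter values are hypothetical), CER and CLR follow directly from the cell counters defined above:

# Hypothetical counters from one measurement interval t1.
A = 1_000_000        # A_cells sent at the Test Cell Input
A_prime = 999_990    # A'_cells received at the Test Cell Output
c = 3                # errored A'_cells (payload CRC faulty)

cer = c / A_prime            # Cell Error Ratio, D.2.4.1
clr = (A - A_prime) / A      # Cell Loss Ratio, D.2.4.2 (negative if
                             # misinsertions exceed losses)
print(cer, clr)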
D.2.4.3. Cell Misinsertion Rate (CMR) A continuous test cell stream containing only B_cells is sent at the Test Cell Input. At the Test Cell Output, no A'_cells should arrive; any that do are misinserted. Therefore, A'_cells are counted during a time interval t2 (service dependent; the value is for further study). CMR is calculated as follows: CMR = A' / t2 where: A': number of received A'_cells (test cells) D.2.4.4. Cell Missequenced Ratio (CSR) ITU-T Recommendation I.356 does not define a cell missequenced ratio (CSR), because the AAL controls missequencing of cells. Nevertheless, missequencing of cells is a faulty behavior of the ATM network that can be observed by the user. Therefore, an appropriate measuring method is recommended. A continuous test cell stream containing A_cells is sent at the Test Cell Input. At the Test Cell Output, the A'_cells are analyzed during a time interval t1 (service dependent; the value is for further study). CSR is calculated as follows: CSR = b / A' where: A': number of received A'_cells (test cells) b: number of b_cells, i.e., cells whose identification in the payload shows that the cell was overtaken by the previous cell (cell sequence error). Example: Cell number : 1 3 4 2 5 Counter value b : n n n n+1 n+1 D.2.4.5. Measuring Cell Transfer Times To be able to perform measurements of the cell transfer time, the equipment generating the test traffic and the equipment analyzing the received test traffic must be synchronized to the order of microseconds. Whereas CTD measurements require synchronization in both frequency and absolute value (known offset), it may be possible to measure CDV with only a synchronization in frequency. At the Test Cell Input, a time stamp (according to O.191) is inserted as the cell identification in the A_Cells payload. At the Test Cell Output, only correct A'_Cells (a_cells and b_cells, no payload CRC fault) are analyzed. All the cell transfer time parameters are evaluated from the measurement of the Cell Transfer Delay (CTD): CTD = tr - ts where: tr: receive time, relative to the synchronized time reference, when the cell reached the Test Cell Output side ts: transmit time, relative to the synchronized time reference, when the cell left the Test Cell Input side Note: CTD is one of the cell transfer performance parameters. It depends very much on the cell traffic in the test traffic, the controlled traffic and the real traffic, as well as on the background traffic. D.2.4.6. Mean Cell Transfer Delay (MCTD) Mean Cell Transfer Delay is the arithmetic average (mean) of the CTD measured over the time period t1 (service dependent; the value is for further study). MCTD(t1) = (Σ CTDi) / a where: a: number of received and correct A'_cells (no payload CRC fault: correct test cells) Σ CTDi: summation of the CTD (tr - ts) over all correct A'_cells. D.2.4.7. Maximum Cell Transfer Delay Maximum cell transfer delay is the maximum value of the CTD, measured over the time period t1 (service dependent; the value is for further study), of all correct A'_cells. MaxCTD = max (CTDi), i = 1, ..., n where: CTDi: the cell transfer delay of the ith correct A'_cell received within t1 n: the total number of correct A'_cells received within t1 This procedure is based on TM 4.1 [2] (Section 3.6.1.1) with the effect of alpha being neglected.
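A minimal sketch of these delay statistics (Python; the CTD samples are hypothetical, in microseconds; the minimum is retained for the peak-to-peak CDV defined in D.2.4.8 below):

# Hypothetical CTD samples (tr - ts) of correct A'_cells within t1, in microseconds.
ctd = [102.4, 98.7, 110.3, 99.1, 105.6]

mctd = sum(ctd) / len(ctd)          # Mean Cell Transfer Delay, D.2.4.6
max_ctd = max(ctd)                  # Maximum Cell Transfer Delay, D.2.4.7
p2p_cdv = max_ctd - min(ctd)        # peak-to-peak two-point CDV, D.2.4.8
print(mctd, max_ctd, p2p_cdv)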
D.2.4.8. Peak-to-peak Two-point Cell Delay Variation Peak-to-peak two-point cell delay variation is the maximum value of the CTD minus the minimum value of the CTD, measured over the time period t1 (service dependent; the value is for further study), of all correct A'_cells. peak-to-peak CDV = MaxCTD - MinCTD where: MinCTD = min (CTDi), i = 1, ..., n with: CTDi: the cell transfer delay of the ith correct A'_cell received within t1 n: the total number of correct A'_cells received within t1 D.2.5. AAL-5 Test Frame Format The test frame is an AAL-5 frame made up of two kinds of instrumented test cells: Frame-body Test Cells and a single End-of-Frame (EOF) Test Cell. Figure D.4: AAL-5 Test Frame Format For a frame of length N cells, the first N-1 cells are Frame-body test cells. The Nth cell is an EOF test cell that differs from the frame-body test cells only in that the last 8 octets of the cell are shifted left 8 octets to make room for the AAL-5 trailer. A test frame that is only one cell long consists of a single EOF test cell. By composing test frames of test cells, it becomes possible to make simultaneous, correlated ATM-layer and frame-layer performance measurements on the same traffic. Overall frame integrity is checked using the AAL-5 CRC-32, while individual cells retain the CRC-16 cell payload integrity check. Due to the use of the EOF test cell, the AAL-5 test frame does not conform to ITU-T Recommendation O.191. The AAL-5 test frame provides the following five benefits: 1. all of the test frame features necessary to implement the frame-level performance measurements specified in Section 4. 2. the ability to obtain simultaneous, correlated performance measurements at the ATM and Frame layers. 3. reuse at the frame layer of implementation engines developed for ATM-layer performance measurement. 4. detection of frame payload corruption (via the test-cell integrity check) even for SUTs that regenerate the AAL-5 CRC-32 at their egress point. 5. optional, optimized runt frame detection, protecting against the potential miscounting of one errored frame for each runt frame received. Algorithms that correlate ATM-layer and AAL-5 frame-layer measurements are for future study. D.2.5.1. AAL-5 Test Frame Test Cell Format The Frame-body test cell is based on that defined in Appendix D.2.2, with the addition of a 16-bit Frame Sequence Number field, long-word aligned at the tail of the UN field, prior and adjacent to the rsvd field. Figure D.5: AAL-5 Test Frame Format Details The End-Of-Frame (EOF) test cell has a format similar to the Frame-body test cell except for the last 8 octets. In the EOF test cell, the UN field is eight octets shorter. This leaves room at the tail of the cell for the AAL-5 trailer. D.2.5.2. Frame Sequence Number The Frame Sequence Number is a sixteen-bit value that increments by one for each test frame transmitted. The value is encoded as the high octet followed by the low octet. Two additional octets are reserved to preserve long-word alignment and to allow for future expansion of the FSN. D.2.5.3. Frame Test Cell Sequence Number The cell sequence number increments by one for each test cell; it is uncorrelated with the FSN, is not reset for each test frame, and counts continuously across frame boundaries. D.2.5.4. Payload Type The TCTP comprises two fields: PPI and REV. PPI, which is the MSB of the TCTP, stands for Proprietary Payload Indicator and must be set to 1 for all test cells in a Test Frame.
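The frame composition described above can be sketched as follows (Python; the field layout is simplified and the UN field sizes shown are illustrative assumptions, not the normative O.191 octet map):

def build_test_frame(n_cells, fsn, next_cell_sn):
    # Compose an AAL-5 test frame of n_cells instrumented test cells.
    # The first n_cells - 1 cells are Frame-body test cells; the last is an
    # End-of-Frame (EOF) cell whose UN field is 8 octets shorter to leave
    # room for the AAL-5 trailer. A one-cell frame is a single EOF cell.
    frame = []
    for i in range(n_cells):
        is_eof = (i == n_cells - 1)
        frame.append({
            "cell_sn": next_cell_sn + i,        # continuous across frame boundaries
            "fsn": fsn,                         # same 16-bit FSN in every cell of the frame
            "un_octets": 20 if is_eof else 28,  # illustrative sizes only
            "eof": is_eof,                      # EOF cell carries the AAL-5 trailer
        })
    return frame, next_cell_sn + n_cells

frame, sn = build_test_frame(n_cells=3, fsn=7, next_cell_sn=100)
print([c["eof"] for c in frame])  # [False, False, True]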
D.3. In-service Measurement of QoS Parameters

D.3.1. Introduction

In operational networks, users and network providers need in-service monitoring tools with which to continuously evaluate the performance of the network and its services. In-service performance monitoring procedures for ATM networks are specified in ITU-T Recommendation I.610 on OAM principles and functions. OAM performance management cells are inserted and carried in a user cell connection in the forward direction to allow measurement of performance parameters on an end-to-end or segment basis. In the reverse flow, these OAM cells report the monitoring results in the backward direction.

D.3.2. ATM Layer OAM Flows

ITU-T Recommendation I.610 specifies procedures for in-service performance monitoring using F4 OAM flows at the VP level and F5 OAM flows at the VC level.

OAM cells for the F4 flow have the same VPI value as the user cells of the VPC and are identified by the preassigned values VCI=3 for segment OAM and VCI=4 for end-to-end OAM. OAM cells are generated and inserted at the originating point of the VPC or VPC segment. Intermediate points along the VPC or VPC segment may monitor passing OAM cells and insert new OAM cells, but cannot extract them. F4 OAM cells may be extracted only at the terminating point of the VPC or VPC segment.

OAM cells for the F5 flow have the same VPI/VCI values as the user cells of the VCC and are identified by the preassigned values PTI=100 for segment OAM and PTI=101 for end-to-end OAM. OAM cells are generated and inserted at the originating point of the VCC or VCC segment. Intermediate points along the VCC or VCC segment may monitor passing OAM cells and insert new OAM cells, but cannot extract them. F5 OAM cells may be extracted only at the terminating point of the VCC or VCC segment.

D.3.3. Performance Management Procedures

At the F4 or F5 level, performance monitoring is done by inserting OAM performance management cells at the originating endpoint. In the forward flow, OAM cells carry information about the preceding transmitted user cell block for forward monitoring. Forward monitoring can detect errored cells, lost/misinserted cells, and optionally cell transfer delay. The performance monitoring results are reported in the reverse flow.

At the originating endpoint, a performance monitoring cell insertion request is initiated after every N user cells. The block size N may take the values 128, 256, 512, or 1024. The monitoring cell is inserted at the first available cell location after the request, so the actual size of the monitored cell block may vary from the nominal block size. For end-to-end performance monitoring, the cell block size may vary up to a maximum margin of 50% of the value of N, after which a forced insertion becomes necessary; the actual monitoring block size therefore averages out to approximately N cells. Forced insertion for segment performance monitoring is an option.
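For illustration only, the following Python sketch models the insertion rule just described: request an insertion after N user cells, insert at the first free cell slot, and force an insertion once the block exceeds N by 50% (end-to-end case). The slot model, a boolean per cell slot marking whether a user cell occupies it, is a hypothetical simplification, not part of I.610 or this specification.

    # Illustrative sketch: where forward-monitoring OAM cells get inserted.
    def insertion_points(slots, n=128, end_to_end=True):
        """Yield slot indices at which an OAM performance cell is inserted.

        slots: sequence of booleans, True if the slot carries a user cell,
               False if the slot is free (idle / unassigned).
        n:     nominal block size (128, 256, 512 or 1024 user cells).
        """
        user_cells = 0      # user cells sent since the last OAM cell
        pending = False     # an insertion request is outstanding
        for i, busy in enumerate(slots):
            if busy:
                user_cells += 1
                if user_cells >= n:
                    pending = True      # request raised after N user cells
                # forced insertion once the block exceeds N by 50% (end-to-end)
                if end_to_end and pending and user_cells >= n + n // 2:
                    yield i + 1         # OAM cell pre-empts the next slot
                    user_cells, pending = 0, False
            elif pending:
                yield i                 # first available (free) slot
                user_cells, pending = 0, False

    # With 300 back-to-back user cells and no free slots, the request raised
    # at cell 128 cannot be served, so insertion is forced after cell 192:
    print(list(insertion_points([True] * 300, n=128)))   # prints [192]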
Performance monitoring can be activated either during connection establishment or at any time after the connection has been established. Activation and deactivation are initiated by the TMN or the end user. After the TMN or end user has requested activation/deactivation, an initialization procedure is needed between the two endpoints of the connection or segment. This initialization procedure serves to coordinate the beginning or end of the transmission and reception of OAM cells, and to establish agreement on the block size and direction of transmission to start or stop monitoring. The initialization procedure can be carried out either (a) using activation/deactivation OAM cells, or (b) entirely via the TMN. Further details on activation/deactivation procedures can be found in ITU-T Recommendation I.610. Following performance monitoring activation, the first performance monitoring cell received is used for initialization only and does not update the performance parameters.

D.3.4. OAM Performance Management Cell

ATM layer OAM cells contain fields common to all types of OAM cells as well as fields specific to each type of OAM cell. The common OAM fields are shown in Figure D.6 and listed below.

* OAM cell type (4 bits): 0010 = performance management;
* OAM function type (4 bits): 0000 = forward monitoring, 0001 = backward reporting;
* function-specific field (45 bytes): described below;
* reserved (6 bits): all zero;
* error detection code (10 bits): CRC-10 computed over the OAM cell payload excluding the EDC field.

Figure D.6: Common OAM cell fields

The OAM performance management cell has the function-specific fields shown in Figure D.7 and listed below.

* Monitoring Cell Sequence Number (MCSN) (1 byte): modulo-256 number used to detect the loss of OAM cells;
* Total User Cell number for the CLP=0+1 user cell flow (TUC0+1) (2 bytes): in a forward monitoring cell, the total number of user cells transmitted prior to insertion of this OAM cell; this number is copied into the TUC0+1 field of the corresponding backward reporting OAM cell;
* Block Error Detection Code (BEDC) (2 bytes): even-parity BIP-16 code computed over the information fields of the block of user cells preceding this OAM cell;
* Total User Cell number for the CLP=0 user cell flow (TUC0) (2 bytes): in a forward monitoring cell, the total number of CLP=0 user cells transmitted prior to insertion of this OAM cell; this number is copied into the TUC0 field of the corresponding backward reporting OAM cell;
* timestamp (4 bytes): optionally records the insertion time of this OAM cell (default = all 1's); its use is for further study;
* unused (29 bytes): each byte is 0110 1010 (6AH);
* Total Received Cell Count for CLP=0 (TRCC0) (2 bytes): in a backward reporting cell, the total number of CLP=0 user cells received prior to receiving the corresponding forward monitoring cell;
* BLock Error Result (BLER) (1 byte): the number of errored parity bits detected via the BIP-16 code of the corresponding forward monitoring cell;
* Total Received Cell Count for CLP=0+1 (TRCC0+1) (2 bytes): in a backward reporting cell, the total number of user cells received prior to receiving the corresponding forward monitoring cell.

Figure D.7: Function-specific fields for the OAM performance management cell
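For illustration only, the following Python sketch computes an even-parity BIP-16 over the information fields of a cell block, as carried in the BEDC field, and derives a BLER-style bit count by comparing codes. Treating BIP-16 as the XOR of successive 16-bit words is the usual bit-interleaved parity construction; framing details are simplified here and the functions are a sketch, not a conformant implementation.

    # Illustrative sketch: BEDC (even-parity BIP-16) and a BLER-style count.
    def bip16(payloads: list[bytes]) -> int:
        """BIP-16 over the 48-octet information fields of a block of cells."""
        code = 0
        for payload in payloads:
            assert len(payload) == 48, "ATM cell information field is 48 octets"
            for i in range(0, 48, 2):
                # fold one 16-bit word into the interleaved parity code
                code ^= (payload[i] << 8) | payload[i + 1]
        return code

    def bler(sent_bedc: int, recomputed_bedc: int) -> int:
        """Errored parity bits: compare the received BEDC with the code
        recomputed over the received user cell block."""
        return bin(sent_bedc ^ recomputed_bedc).count("1")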
D.3.5. Out-of-service Use of Performance Management OAM

In addition to the in-service application, it is proposed that a basic method for out-of-service testing of ATM networks be defined in which end-to-end performance management OAM cells are transmitted and the user data cells are replaced with a dummy load. This dummy load could be a PRBS or a fixed pattern. A PRBS would be the best load for checking network transparency (that is, checking that the network carries all bit patterns correctly). In addition to the measurements performed on the performance management OAM cells, a cell or bit error ratio measurement may be performed on the received payload if required. Performance management OAM capability may be designed into ATM network or end equipment. It is therefore a convenient test method to use when network management systems control the out-of-service test or when test equipment analyzes a test signal generated by a network element.

D.3.6. Benefits of the Performance OAM Technique

Use of performance OAM out-of-service would have the following benefits:

* A single test methodology for in-service and out-of-service testing ensures that results obtained in each situation are comparable.
* Because OAM cells arrive at a lower rate than user cells, more time is available for processing measurements. This is particularly useful in processor-based measurement architectures.
* Because the sequence number used is eight bits (as opposed to 36 bits in the out-of-service test cell), measurement processing is simpler.
* Because of the widespread deployment of performance OAM techniques in ATM network elements, off-the-shelf devices are readily available to perform these functions.
* Testing an ATM network using performance OAM exercises the network's ability to carry the OAM flows as well as user data.
* The backward reporting mechanism would provide a convenient method for test equipment to communicate results.
* Where simultaneous measurements on multiple channels are performed, performance OAM allows a mixture of in-service and out-of-service testing. In this case, channels carrying live traffic can be monitored at the same time as test channels.
* It is a simple test method that lends itself to economical implementation of low-end ATM test equipment.

Because only the OAM cells carry a timestamp, the cell delay and cell delay variation (CDV) measurements obtained are only sampled values. This should not be a problem when computing mean cell transfer delay, but some extremes of the CDV characteristic may not be seen. By using one-point CDV measurement techniques, the CDV characteristic is obtained for all cells. This is relevant only for a CBR distribution, but that is where CDV measurement is most often required.

D.3.7. ATM Test Cell

Out-of-service use of performance management OAM is a test method complementary to use of the ATM test cell. The test cell is necessary to isolate cell loss, misinsertion and missequencing errors more precisely. Also, precise evaluation of cell delay and cell delay variation requires a more frequently transmitted timestamp than that provided by OAM cells.

Appendix E: Examples of Frame Reference Load Models

E.1. Introduction

Appendix E.2 describes (in pseudo-code) how an abstract test case can be generated. Appendix E.3 contains several examples illustrating how parameters might be specified to accommodate a wide range of traffic sources. The representations should be considered a definitional framework and as such are not tied to any specific implementation.

E.2. Pseudo-code Representation of an Abstract Test

The role each configurable parameter plays is illustrated in the following pseudo-code representation of an abstract test. The references to function_of(...) mean that the parameter's value is obtained in some manner: the value may be a constant, or may be obtained from a stochastic process, a probability distribution, an empirical distribution, some other representation of a traffic model, etc. This is a representation of an abstract test, intended to illustrate parameter roles and relationships; RLMs need not be structured this way. Examples are provided in the next section.
begin test;
    number_phases = function_of(...);
    for I = 1 to number_phases do;                      /* generate phases */
        number_active_periods = function_of(...);
        for J = 1 to number_active_periods do;          /* generate active periods */
            number_intervals = function_of(...);
            for K = 1 to number_intervals do;           /* generate characteristic intervals */
                number_frames = function_of(...);
                for L = 1 to number_frames do;          /* generate frames */
                    frame_size = function_of(AAL,...);  /* number of cells */
                    for M = 1 to frame_size do;         /* generate cells */
                        transfer cell;
                        if (M < frame_size) then do;    /* no gap after last cell */
                            inter_cell_gap = function_of(...);
                            wait one inter_cell_gap;
                        endif;
                    endfor;  /* frame_size */
                    if (L < number_frames) then do;     /* no gap after last frame */
                        inter_frame_gap = function_of(...);
                        wait one inter_frame_gap;
                    endif;
                endfor;  /* number_frames */
                if (K < number_intervals) then do;      /* no gap after last interval */
                    inter_interval_gap = function_of(...);
                    wait one inter_interval_gap;
                endif;
            endfor;  /* number_intervals */
            if (J < number_active_periods) then do;     /* no gap after last active period */
                inter_activity_gap = function_of(...);
                wait one inter_activity_gap;
            endif;
        endfor;  /* number_active_periods */
        if (I < number_phases) then do;                 /* no gap after last phase */
            inter_phase_gap = function_of(...);
            wait one inter_phase_gap;
        endif;
    endfor;  /* number_phases */
end test;

E.3. Examples

Several examples are included to illustrate how parameters might be specified to accommodate various traffic sources.

E.3.1. Simple Single-cell Test

One possible set of parameters is:

    number_phases = 1
    inter_phase_gap = 0
    number_active_periods = 1
    inter_activity_gap = 0
    number_intervals = 1
    inter_interval_gap = 0
    number_frames = 1
    inter_frame_gap = 0
    frame_size = 1 cell
    inter_cell_gap = 0

E.3.2. Constant Frame Size, Constant Inter-frame Gap

One possible set of parameters is:

    number_phases = 1
    inter_phase_gap = 0
    number_active_periods = 1
    inter_activity_gap = 0
    number_intervals = 1
    inter_interval_gap = 0
    number_frames =
    inter_frame_gap = a constant
    frame_size = a constant
    inter_cell_gap =

E.3.3. Constant Frame Size, Variable Inter-frame Gap

One possible set of parameters is similar to Appendix E.3.2, with:

    frame_size = a constant
    inter_frame_gap = variate from some traffic process, model, distribution, etc.

E.3.4. Variable Frame Size, Constant Inter-frame Gap

One possible set of parameters is similar to Appendix E.3.2, with:

    frame_size = variate from some traffic process, model, distribution, etc.
    inter_frame_gap = a constant

E.3.5. Variable Frame Size, Variable Inter-frame Gap

One possible set of parameters is similar to Appendix E.3.2, with:

    frame_size = variate from some traffic process, model, distribution, etc.
    inter_frame_gap = variate from some traffic process, model, distribution, etc.

E.3.6. Persistent (Greedy) Source

One possible set of parameters is:

    number_phases = 1
    inter_phase_gap = 0
    number_active_periods = 1
    inter_activity_gap = 0
    number_intervals = 1
    inter_interval_gap = 0
    number_frames = maximum
    inter_frame_gap = one inter_cell_gap time
    frame_size = maximum
    inter_cell_gap = computed from allowed rate
    cell_transmission_rate = full allowed rate
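For illustration only, the following Python sketch is a direct rendering of the E.2 pseudo-code. Each function_of(...) is modeled as a callable supplied by the caller; the parameter names mirror the pseudo-code but this implementation is not normative, and time is simulated rather than real.

    # Illustrative sketch: the E.2 abstract test as a Python generator.
    def run_abstract_test(p):
        """Yield (time, event) pairs; p maps parameter names to callables."""
        t = 0.0
        n_phases = p["number_phases"]()
        for i in range(n_phases):
            n_active = p["number_active_periods"]()
            for j in range(n_active):
                n_intervals = p["number_intervals"]()
                for k in range(n_intervals):
                    n_frames = p["number_frames"]()
                    for l in range(n_frames):
                        n_cells = p["frame_size"]()        # number of cells
                        for m in range(n_cells):
                            yield (t, "cell")
                            if m < n_cells - 1:            # no gap after last cell
                                t += p["inter_cell_gap"]()
                        if l < n_frames - 1:               # no gap after last frame
                            t += p["inter_frame_gap"]()
                    if k < n_intervals - 1:                # no gap after last interval
                        t += p["inter_interval_gap"]()
                if j < n_active - 1:                       # no gap after last active period
                    t += p["inter_activity_gap"]()
            if i < n_phases - 1:                           # no gap after last phase
                t += p["inter_phase_gap"]()

    # Example parameters in the style of E.3.2 (constant frame size and
    # inter-frame gap); the specific values are illustrative only.
    params = {
        "number_phases": lambda: 1,        "inter_phase_gap": lambda: 0.0,
        "number_active_periods": lambda: 1, "inter_activity_gap": lambda: 0.0,
        "number_intervals": lambda: 1,     "inter_interval_gap": lambda: 0.0,
        "number_frames": lambda: 10,       "inter_frame_gap": lambda: 0.001,
        "frame_size": lambda: 3,           "inter_cell_gap": lambda: 0.0001,
    }
    for t, event in run_abstract_test(params):
        print(f"{t:.4f}s {event}")

Variable sources (E.3.3 through E.3.5) are obtained simply by replacing the constant lambdas with draws from the chosen traffic process or distribution.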
E.3.7. Staggered Persistent Source

This is similar to Appendix E.3.6, except that the sources start in a staggered fashion. Characteristic intervals can be used to represent the delayed starts: e.g., source 1 starts during interval 1; source 2 starts during interval 2 and joins source 1; source 3 starts during interval 3 and joins sources 1 and 2; etc. Phases can also be used to represent the delayed starts: e.g., source 1 starts during test phase 1; source 2 starts during phase 2 and joins source 1; source 3 starts during phase 3 and joins sources 1 and 2; etc.

E.3.8. Variable Load Source

This is similar to Appendix E.3.6, except that frames contain one cell. Inter-cell gaps are exponentially distributed.

E.3.9. Bursty Source

Blocks of cells are represented by frames. The frame size is B cells. Inter-frame gaps are exponentially distributed with mean IBI. Bursts and inter-burst gaps can also be modeled by active and idle periods, or by phases.

E.3.10. Three-state Traffic Source

A commonly used traffic source model consists of three states. The source is either active or idle; no traffic is generated while the source is idle. When the source is active, it generates a burst of frames interspersed with short pauses. The burst size is geometrically distributed with mean Np frames. Frame lengths are geometrically distributed with mean Nb bytes. Pauses between frames are exponentially distributed with mean Toff. The lengths of the idle periods between active states are exponentially distributed with mean Tidle. The frame transmission time is Ton. One possible set of parameters is:

    number_phases = 1
    inter_phase_gap = 0
    number_active_periods =
    inter_activity_gap = variate ~ exponential(Tidle)
    number_intervals = 1
    inter_interval_gap = 0
    number_frames = variate ~ geometric(Np)
    inter_frame_gap = variate ~ exponential(Toff)
    frame_size = fixed, or computed from variate ~ geometric(Nb x 8)
    inter_cell_gap =
    cell_transmission_rate = derived from Ton
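For illustration only, the following Python sketch draws the E.3.10 variates for one active period of the three-state source. The mapping of geometric and exponential draws onto the RLM parameters follows the list above but is not normative, and the conversion of frame bytes to cells assumes 48-octet cell payloads.

    # Illustrative sketch: one active period of the E.3.10 three-state source.
    import random

    def draw_geometric(mean, rng):
        """Geometric variate with the given mean (support 1, 2, ...)."""
        p = 1.0 / mean
        k = 1
        while rng.random() > p:
            k += 1
        return k

    def three_state_active_period(np_mean, nb_mean, toff_mean, tidle_mean,
                                  rng=random.Random(1)):
        """Return (idle_gap, bursts) for one active period of the source."""
        idle_gap = rng.expovariate(1.0 / tidle_mean)    # inter_activity_gap
        n_frames = draw_geometric(np_mean, rng)         # number_frames ~ geometric(Np)
        bursts = []
        for _ in range(n_frames):
            frame_bytes = draw_geometric(nb_mean, rng)  # frame length ~ geometric(Nb)
            frame_cells = -(-frame_bytes // 48)         # ceil: 48-octet payloads assumed
            pause = rng.expovariate(1.0 / toff_mean)    # inter_frame_gap ~ exp(Toff)
            bursts.append((frame_cells, pause))
        return idle_gap, bursts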
Appendix F: Examples for Reporting Test Results

This appendix provides two tables for reporting test results. The first table covers the configuration parameters, describing the SUT from the hardware and software points of view as well as the test equipment configuration used during the verification. Diagrams showing the configuration are encouraged. The second table collects traffic-related parameters, such as the RLM used for foreground and/or background traffic, the metrics considered, and their measured values. Other parameters may be added, or a different layout may be used for the tables.

Configuration Parameters

    Configuration parameters / test cases    Test case 1    Test case 2    Test case 3
    Number of ports                          8              4
    Rate of each port                        155 Mbps       34 Mbps
    Number of ports per network module       4              2
    Number of network modules                2              2
    Number of network modules per fabric     2              2
    Number of fabrics                        1              1
    SW version                               R1             R2
    Test equipment                           X              X
    Traffic replication                      Serial         Serial
    Inter / Intra module                     Intra          Intra

Traffic Parameters (Test case 1)

    Traffic parameters            Foreground traffic            Background traffic
    Type of traffic               1 permanent VCC               n-1 permanent VCCs (n=8)
    Established points            between ports on different    between ports on different
                                  network modules               network modules
    Connection configuration      n-to-n straight (n=8)         n-to-n straight (n=8)
    Service category              UBR                           CBR
    RLM parameters                RLM1, 80 Mbps, MFL = 100      RLM2, 40 Mbps, MBL = 50
    Metric measured               Throughput (peak)             (*)
    Value                                                       (*)
    (Optional) Performance
    enhancing schemes & traffic
    management functions          Y                             (*)

(*) The SUT performance is measured only on traffic carried by the foreground channel, so metrics and their measured values are not of interest for connections carrying background traffic.

Appendix G: Simple Statistical Methods Recipe

This appendix provides a simple, usable recipe for applying statistical methods when measuring performance.

G.1. Confidence Interval Estimation

G.1.1. Purpose

By definition, an estimator is not expected to yield the exact value of the parameter we are trying to estimate (e.g., the mean); it should usually equal this value plus or minus some sampling error. If this sampling error is large, the measured mean may have little value to the performance analyst. For instance, assume the measured values for throughput are 2, 2, 2, 18, 18 and 18 frames per second. The mean of this sample is 10 frames per second. Given the large variation in the data, this sample mean may not accurately reflect the true mean: if the test were run longer, it might be that all future measurements are 18 frames per second, so the sample mean would approach 18 frames per second in the larger sample. Simple statistical methods exist that estimate how close the measured mean is to the true mean. We can use the estimator's variance and the sample size as a measure of this sampling error, to help us judge how precise our estimate is. We want an interval of values in which, up to a certain level of confidence, the estimated parameter (the mean) is included. This is referred to as a confidence interval for the estimated parameter.

G.1.2. Confidence Interval for the Mean

This section explains how to build a confidence interval for a distribution's mean. The underlying theory of confidence intervals is beyond the scope of this document; any introductory statistics text provides the rationale for this methodology.

G.1.2.1. Conditions

* The distribution's mean, mu, and variance, sigma^2, exist (i.e., are not infinite).
* The observations are i.i.d. (independent and identically distributed). If you suspect your observations are correlated, it is recommended to use one of the several methods that exist to work around the correlation; see, for example, the batch means method (Appendix G.2).
* n is large enough. In theory, n > 30 is often considered large enough. However, the correct value of n depends heavily on the original distribution: the more symmetric and Gaussian-like it is, the faster the convergence and the smaller n can be; the more asymmetric, skewed and noisy it is, the slower the convergence and the larger n must be.

G.1.2.2. Confidence Interval

If the above conditions are satisfied, then we have the following asymptotic confidence interval for mu: mu is included in

    [ m - z(1-alpha/2) * s / sqrt(n) ,  m + z(1-alpha/2) * s / sqrt(n) ]

with a confidence level of 1 - alpha, where m is the sample mean, s^2 is the sample variance, and z(1-alpha/2) is the 1-alpha/2 quantile of the Gaussian distribution with a mean of zero and a variance of one.

G.1.2.3. Walkthrough

1. Collect n observations and compute the sample's mean, m, and variance, s^2:

    m = (1/n) * (x1 + x2 + ... + xn)
    s^2 = (1/(n-1)) * sum over i of (xi - m)^2

   or, equivalently,

    s^2 = (1/(n-1)) * ((sum over i of xi^2) - n * m^2)

2. Choose a level of confidence 1 - alpha for your confidence interval and find the corresponding z(1-alpha/2). The most common values for a confidence level are 90%, 95%, 99% and 99.9%. The z(1-alpha/2) for these values appears in the table of Appendix C.1 (wherein the confidence level is listed under the heading "Confidence" and z(1-alpha/2) is listed under the heading "Quantile"). If you want to use another value for the confidence level, you can find the corresponding z(1-alpha/2) in any introductory statistics book and in many statistics-capable software packages.

3. The confidence interval at level 1 - alpha for mu is then

    [ m - z(1-alpha/2) * s / sqrt(n) ,  m + z(1-alpha/2) * s / sqrt(n) ]
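For illustration only, the following Python sketch applies this walkthrough to the throughput sample from G.1.1 (2, 2, 2, 18, 18, 18 frames per second). The value 1.960 is the standard Gaussian quantile z(0.975) for a 95% confidence level, as tabulated in Appendix C.1; note that n = 6 is far below the n > 30 guideline of G.1.2.1, so the interval is only indicative.

    # Illustrative sketch: the G.1.2.3 walkthrough on the G.1.1 sample.
    import math

    def mean_confidence_interval(xs, z=1.960):
        """Return (m, lo, hi): sample mean and its Gaussian confidence interval."""
        n = len(xs)
        m = sum(xs) / n                                   # step 1: sample mean
        s2 = sum((x - m) ** 2 for x in xs) / (n - 1)      # step 1: sample variance
        half = z * math.sqrt(s2 / n)                      # half-width z * s / sqrt(n)
        return m, m - half, m + half                      # step 3

    m, lo, hi = mean_confidence_interval([2, 2, 2, 18, 18, 18])
    print(f"mean = {m:.1f} frames/s, 95% CI = [{lo:.1f}, {hi:.1f}]")
    # prints roughly: mean = 10.0 frames/s, 95% CI = [3.0, 17.0]
    # The wide interval confirms that this small, highly variable sample
    # pins down the true mean only loosely.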
G.1.3. Confidence Interval Using the Bootstrap Method

G.1.3.1. Purpose

The bootstrap method is an alternative method for computing confidence intervals. It is useful for cases where the estimator's variance is hard (or impossible) to compute, where the sample is too small, or where the sample is not well behaved enough to justify a Gaussian approximation. The method consists of approximating the real distribution by the collected sample: from this sample we generate many "virtual" samples and compute an estimator for each of them.

G.1.3.2. Conditions

The bootstrap method assumes that the sample's observations are i.i.d. (independent and identically distributed). If the observations are correlated, it is recommended to use one of the several methods that exist to work around the correlation; see, for example, the batch means method (Appendix G.2).

G.1.3.3. Walkthrough

0. Determine theta, the parameter you wish to estimate, and theta_hat, its estimator; for instance, the sample mean.
1. Collect your sample of n observations. The choice of n is arbitrary; however, the larger n is, the more precise the approximation of the real distribution.
2. Choose a level of confidence 1 - alpha for your confidence interval. The most common values for a confidence level are 90%, 95%, 99% and 99.9%.
3. From the original sample, draw n observations with replacement. The probability of drawing any one observation should be 1/n (equally likely); this is commonly referred to as a discrete uniform distribution. Note: the n here is the same as in step 1.
4. Calculate theta_hat for this bootstrapped sample and keep the value in memory.
5. Repeat steps 3 and 4 m times. The number of repetitions m should be fairly large (e.g., 1000).
6. Sort the obtained estimators in ascending order. Let theta_hat(1) <= theta_hat(2) <= ... <= theta_hat(m) be the sorted list of estimators.
7. Define j = floor(m * alpha/2) + 1 and k = ceiling(m * (1 - alpha/2)), where floor means round down and ceiling means round up.
8. The bootstrapped confidence interval at level 1 - alpha is then [ theta_hat(j) , theta_hat(k) ].

G.2. Batch Means

G.2.1. Purpose

Many statistical inference procedures require that the observations be i.i.d. Unfortunately, the observations in most samples are not independent but heavily correlated. The batch means method is a simple way to get around correlation and build a set of asymptotically independent observations from correlated (non-independent) samples.

G.2.2. Conditions

The main condition for the batch means method is that the observations have no long-term dependency: the correlation between two observations should approach 0 as the distance between them increases. In other words, the ability to predict X(t+k) knowing X(t) should decrease as k increases. This condition can be violated, for example, if the system under test has a cyclic behavior. Another condition is to have "enough" observations; "enough" depends on how well behaved the original observations are, but a hundred observations should be deemed a reasonable minimum for most experiments.

G.2.3. Walkthrough

Supposing that the hypothesis of no long-term dependency between the observations is verified, the method can be applied as follows:

1. Let x1, x2, ..., xn be the original sample.
2. Take b and k, two integers such that b * k = n. Usually one takes b and k as balanced as possible (i.e., b and k both close to sqrt(n)). The number b is the number of batches, and k is the number of samples in each batch.
3. Divide the sample's xi into b successive batches of k observations and compute the mean of each batch. That is:

    y(j) = (1/k) * ( x((j-1)k+1) + x((j-1)k+2) + ... + x(jk) ),  j = 1, ..., b

4. If there is no long-term dependency among the observations and if k is big enough, then the batch means are asymptotically independent.
5. Take the calculated batch means as a new sample of size b and calculate a confidence interval using the method described in Appendix G.1.2.
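For illustration only, the following Python sketch implements the G.1.3.3 percentile bootstrap and the G.2.3 batch means reduction. The sample data, the choice of m = 1000 resamples, and the use of the sample mean as the estimator are illustrative assumptions, not part of this specification.

    # Illustrative sketch: percentile bootstrap (G.1.3.3) and batch means (G.2.3).
    import math
    import random

    def bootstrap_ci(xs, alpha=0.05, m=1000, rng=random.Random(0)):
        """Percentile bootstrap confidence interval for the sample mean."""
        n = len(xs)
        estimates = sorted(
            sum(rng.choice(xs) for _ in range(n)) / n   # steps 3-4: resample, estimate
            for _ in range(m)                           # step 5: repeat m times
        )                                               # step 6: sort ascending
        j = math.floor(m * alpha / 2) + 1               # step 7 (1-based ranks)
        k = math.ceil(m * (1 - alpha / 2))
        return estimates[j - 1], estimates[k - 1]       # step 8

    def batch_means(xs, b):
        """Reduce a correlated sample to b asymptotically independent batch means."""
        k = len(xs) // b                                # batch size (drops any remainder)
        return [sum(xs[j * k:(j + 1) * k]) / k for j in range(b)]

    xs = [2, 2, 2, 18, 18, 18]
    print("bootstrap 95% CI for the mean:", bootstrap_ci(xs))

The batch means returned by batch_means() are then fed to the Gaussian confidence interval of Appendix G.1.2 as a new sample of size b.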
G.2.4. Discussion

It should be mentioned that the batch means method is a heuristic; its efficiency is not analytically proven. However, it is a widely used method that is known to give good results, as long as the conditions are satisfied and b and k are large enough. The method can be justified as follows. Since we assumed that the correlation between observations decreases as the distance between them increases, the covariance between two successive batch means is composed mainly of the covariance of the observations at the fringes of each batch. By increasing the size of a batch, one reduces the relative importance of these fringe observations. If the batch size is large enough, the contribution of the fringes becomes negligible.

G.2.4.1. How to Choose b and k?

As the paragraph above suggests, the greater k is, the less dependent the batch means will be. However, if k is too large, the final sample of b observations will be very small, which can cause a problem when building a confidence interval. The optimal equilibrium between k and b is different for each experiment, but it is common practice to take k and b approximately equal.

END OF DOCUMENT