Jim Rigsby and Associates

 

A DISCUSSION ON VoIP

Peter Lupica
Copyright 12/15/2003

 

Some forty years ago, the telephone companies began to introduce digital carrier systems (DS1 or T1) and thus established the digital carrier hierarchy that is still prevalent today. These systems digitized voice, or analog communications, into a series of 1's and 0's so that, in essence, "it looked just like data". In fact, since the beginnings of digitized voice, there has been little argument that voice bits are somehow different from data bits. This digitization offered numerous advantages, primarily in reducing and controlling transmission impairments.

Voice does, however, have some unique characteristics when compared to data. Chief among them is the fact that voice communication is time sensitive: each packet has to arrive not only in the proper sequence (order) but also within a certain time frame. When using our PCs, most of us have probably experienced 'hung' applications or a seemingly inordinate delay in response time. One need only imagine how a similar situation would impact a voice call. Due in part to this, voice communication has remained a circuit-switched application. Conventional TDM (Time Division Multiplexing) is certainly a reliable means of supporting voice communication; however, a 'disadvantage' of TDM is that the circuits, or paths, sit idle until they are required to service another connection. In the case of DS1 (T1-Carrier), there are 24 time slots or paths, each of which can support one connection for a voice call. With fluctuating traffic, it is not uncommon for the facility to be, on average, largely idle.
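As a rough illustration of the idle-capacity point, the DS1 arithmetic can be sketched as follows. The 24 time slots come from the discussion above; the 64 kbit/s per slot and the 8 kbit/s of framing are the standard DS1 figures, and the count of six calls in progress is purely an assumed value for the example.

```python
# DS1 (T1) capacity arithmetic. The 24 time slots are from the text;
# 64 kbit/s per slot and 8 kbit/s of framing are the standard DS1 figures.
TIME_SLOTS = 24
SLOT_RATE_BPS = 64_000   # one DS0: 8,000 samples/s x 8 bits per sample
FRAMING_BPS = 8_000      # one framing bit per frame, 8,000 frames/s


def ds1_line_rate_bps() -> int:
    """Total DS1 line rate in bits per second."""
    return TIME_SLOTS * SLOT_RATE_BPS + FRAMING_BPS


def idle_fraction(busy_slots: int) -> float:
    """Fraction of time slots carrying no call at a given instant."""
    if not 0 <= busy_slots <= TIME_SLOTS:
        raise ValueError("busy_slots must be between 0 and 24")
    return (TIME_SLOTS - busy_slots) / TIME_SLOTS


if __name__ == "__main__":
    print(ds1_line_rate_bps())  # 1544000
    print(idle_fraction(6))     # 0.75 (assumed: 6 calls in progress)
```

With only six calls up, three quarters of the facility carries nothing, yet that capacity cannot be lent to any other traffic.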

With regard to data communications, the latter situation is somewhat similar in that contemporary networks are generally underutilized, with average occupancy of 10% to 15% or less. However, their architecture is fundamentally different in that the entire facility is used for communication, analogous to a 'party line', with the protocol keeping track of which messages are intended for which users.

Since most organizations currently have a data communications network as well as a voice communication network, it would appear that savings, in some cases on a rather significant scale, could be realized from having only one network. This would be the case not only from the MAN/WAN perspective but also from the perspective of the LAN for local distribution or cabling. This factor, along with the promise of futuristic applications, has largely fueled the movement toward a "converged" network. In many circles, IP communications is believed to be the next generation of networking technology that will be used to combine voice, data, video, wireless, and multimedia applications into a single integrated enterprise infrastructure. It holds the promise not only of much more efficient use of bandwidth, by having voice and data share the same connections and networks, but also of handling all types of traffic and delivering more services than were available with separate voice and data networks.

VoIP, as it is generally referred to, stands for "Voice over Internet Protocol". Some also refer to it simply as 'voice over' or just IP. There is also some interest in accommodating voice over a Frame Relay network, which is referred to as VoFR. Regardless of what it is called, it means taking the digitized voice signals, assembling them into a protocol data unit, and transmitting them over a shared facility. The front runner for the protocol of choice is the Internet Protocol. What makes IP a good choice, besides the fact that it has become the universal standard for enterprise networks, is that it operates over a wide array of physical networks, from Ethernet LANs to MANs to WANs, almost regardless of the underlying transport mechanism. IP is able to do this because it hides the physical network details from the application and provides a consistent user interface.

There appears to be little doubt that this technology will eventually become commonplace. The reason is that it has widespread vendor support. Most, if not all, of the major data hardware vendors (notably Cisco Systems) are solidly in this camp. PBX vendors are beginning to alter the fundamental switching fabric of their systems, away from TDM and towards packetized voice using IP. The major communications carriers, such as Sprint, AT&T, MCI, SBC, and Verizon, are beginning to deploy IP-based networks to carry voice communications. Their impetus is derived from the potential cost savings that could be realized by more fully utilizing their network infrastructures and by offering network-based applications which could not be easily implemented just a few years ago.

The evolution of VoIP seems to be inevitable. It appears that this is all you will be able to buy in a matter of a couple of years. However, deploying, or attempting to deploy, a VoIP network is not without its pitfalls. Voice is more than just an IP network application. It is a fundamental business and consumer service that has for a century been delivered on a daily basis with predictable quality. When VoIP technology is deployed for voice services, users will both expect and need service quality that matches that of the Public Switched Telephone Network (PSTN). Voice, being a real-time application, requires special QoS (Quality of Service) considerations that are not needed by data. Being time-sensitive, voice has a low tolerance for delay, and an even lower tolerance for delay variance, or jitter. In addition, voice applications generally have a low tolerance for packet loss. Since voice most often utilizes UDP (User Datagram Protocol), there is no real 'end-to-end' connection, as this would essentially defeat its purpose. That means that a lost packet is lost data; there are no re-transmissions.
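Because UDP never retransmits, a receiver can only notice that packets are missing or out of order, typically by numbering them as RTP does. The following is a minimal sketch of that bookkeeping, not a description of any particular product. The function name and the simplifications (no sequence-number wraparound, and the first and last packets of the stream assumed to have arrived) are ours.

```python
from typing import Dict, List


def summarize_arrivals(received_seq: List[int]) -> Dict[str, int]:
    """Count lost and out-of-order packets from sequence numbers.

    received_seq lists sequence numbers in the order they arrived.
    Assumptions (illustrative only): no wraparound, and the lowest and
    highest numbers seen mark the start and end of the stream.
    """
    if not received_seq:
        return {"expected": 0, "lost": 0, "out_of_order": 0}
    expected = max(received_seq) - min(received_seq) + 1
    lost = expected - len(set(received_seq))
    # A packet is out of order if it arrives after a higher-numbered one.
    out_of_order = sum(
        1 for prev, cur in zip(received_seq, received_seq[1:]) if cur < prev
    )
    return {"expected": expected, "lost": lost, "out_of_order": out_of_order}


if __name__ == "__main__":
    # Packet 5 never arrives; packet 3 arrives after packet 4.
    print(summarize_arrivals([1, 2, 4, 3, 6]))
    # {'expected': 6, 'lost': 1, 'out_of_order': 1}
```

Nothing here recovers the missing packet. The receiver can only conceal the gap or play silence, which is why loss shows up directly as degraded clarity.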

While it is well known that the IP network performance parameters that impact voice are packet loss, delay, and jitter, the type and degree of impact that these parameters have on voice quality is less well known. This is because there are many other VoIP processes that impact voice, and these various processes, together with IP network performance, influence each other in complex ways to affect overall voice service quality.

A good IP Communications system is standards-based. However, as the saying goes, the thing about standards is that there are so many from which to choose. A standards-based system, while not absolutely guaranteeing interoperability, goes a long way toward ensuring interoperability among and between the products and services of different vendors. The difficulty at this point is that these standards are still emerging. This fact does not lend itself to implementing a system that is easily installed and maintained, especially when considering that the system should also allow a network to be upgraded or migrated in stages while still being able to interoperate with existing 'legacy' systems, whether they be voice or data.

It appears that the current standards efforts that are receiving the most attention are addressing issues such as call set-up and disconnect procedures (e.g., H.323 vs. SIP) and schemes aimed at 'enhancing' packet delivery mechanisms (e.g., MPLS vs. DiffServ). While these individual areas will in fact have an impact on VoIP, in our opinion the critical area of concern has to do with the somewhat elusive concept of "voice quality" or "call quality". Thus, it warrants further discussion.

The two key parameters of voice service quality most affected by IP network performance and VoIP processing are voice clarity (also known as speech quality) and voice delay. Voice clarity depends on many factors in addition to packet loss and jitter, and the various factors influence one another. It is vital that the specific impact of these parameters be known before judgments are made. For example, a certain degree of packet loss can have varying effects on clarity, so it may prove unwise to invest in QoS technology to overcome a perceived packet loss problem if packet loss does not appear to affect voice quality. Voice delay includes more than just IP packet transmission delay. Delay can be introduced from a number of sources, including VoIP gateway processes such as codecs and jitter buffers. High packet jitter can add to delay by increasing a gateway's jitter buffer size requirements. Actual packet delay will simply add to this. Thus, it is vital to know what the end user delay experience will be, and this can only be accomplished with active voice delay measurements. Knowing how an IP network will perform in terms of these important end user service parameters, and in terms of the underlying factors of packet loss and jitter, is very valuable prior to making critical decisions and investments regarding a VoIP deployment. This is the primary purpose of a pre-VoIP network assessment (discussed below).
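To make the point that voice delay is more than packet delay, here is a small sketch of a delay budget that adds up the sources named above. The component names and the sample millisecond values are assumptions chosen for illustration, not measurements, and the round-trip figure assumes both directions of the call behave the same.

```python
from dataclasses import dataclass


@dataclass
class DelayBudget:
    """One-way voice delay components, all in milliseconds (assumed values)."""
    codec_ms: float          # encode and decode processing in the gateways
    packetization_ms: float  # time spent collecting voice for one packet
    network_ms: float        # one-way IP packet delay
    jitter_buffer_ms: float  # receive-side buffering that absorbs jitter

    def one_way_ms(self) -> float:
        return (self.codec_ms + self.packetization_ms
                + self.network_ms + self.jitter_buffer_ms)

    def round_trip_ms(self) -> float:
        # Assumes a symmetric path and identical equipment at both ends.
        return 2 * self.one_way_ms()


if __name__ == "__main__":
    budget = DelayBudget(codec_ms=5, packetization_ms=20,
                         network_ms=40, jitter_buffer_ms=60)
    print(budget.one_way_ms())     # 125
    print(budget.round_trip_ms())  # 250
```

In this made-up example the network accounts for less than a third of what the user hears as delay, and the jitter buffer is the largest single contributor.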

Data network performance is usually reported using several metrics, since there are many factors to consider. However, in the "voice" telephony world, call quality measurement has traditionally been subjective and is accomplished by listening to the quality of a voice call. The leading subjective, single-metric measurement of voice quality is the MOS (mean opinion score). This is derived by having a group of people listen to the call and give their opinion of the call quality on a scale from 1 to 5, with 5 being best.
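The arithmetic behind a MOS is nothing more than an average of listener ratings. A minimal sketch follows; the panel ratings shown are invented for illustration.

```python
from statistics import mean
from typing import List


def mean_opinion_score(ratings: List[int]) -> float:
    """Average a panel's ratings, each on the 1 (worst) to 5 (best) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    return float(mean(ratings))


if __name__ == "__main__":
    # Invented ratings from a five-person listening panel.
    print(mean_opinion_score([4, 4, 3, 5, 4]))  # 4.0
```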

Since VoIP is a data network application, the MOS method leaves much to be desired in measuring call quality, which is at the very heart of the matter and perhaps the most important criterion to consider. Progress has been made in establishing objective measurements of call quality. Again, various standards have been developed and espoused:

  • PSQM (ITU P.861) / PSQM+: Perceptual Speech Quality Measure
  • MNB (ITU P.861): Measuring Normalizing Blocks
  • PESQ (ITU P.862): Perceptual Evaluation of Speech Quality
  • PAMS (British Telecom): Perceptual Analysis Measurement System
  • The E-model (ITU G.107)

PSQM, PSQM+, MNB, and PESQ are part of a succession of algorithm modifications starting in ITU standard P.861. British Telecom developed PAMS, which is similar to PSQM. The PSQM and PAMS measurements send a reference signal through the network and then use digital signal processing algorithms to compare the reference signal with the signal received at the other end of the network. These measurements are frequently found in test labs and are used primarily for analyzing the clarity of individual devices such as a telephone handset. Vendors that implement these algorithms then map their scores to MOS. However, these approaches are not really well suited to assessing call quality on a data network. The models used are not based on data network issues, so they do not lend themselves to mapping back to the network issues of delay, jitter, and datagram loss. Also, they aren't suited to the two-way simultaneous flows of a real phone conversation, and they don't scale to allow evaluation of the quality of hundreds or thousands of simultaneous calls. On the other hand, the "E-model" (ITU G.107) is a complex formula that calculates a single score called an "R factor". Once an R factor is obtained, it can be used to calculate an estimated MOS. R factor values range from 100 (excellent) down to 0 (poor), whereas a MOS can range from 5 down to 1.
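The conversion from an R factor to an estimated MOS can be sketched as follows, using the mapping published in ITU-T G.107. Note that the estimated MOS produced this way tops out at 4.5 rather than 5. The function name and the clamp at 1.0 for very low R values are our own choices for the sketch.

```python
def r_to_mos(r: float) -> float:
    """Estimate MOS from an E-model R factor (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
    # The polynomial dips slightly below 1 for very small R; clamping it
    # is our own choice so the result stays on the MOS scale.
    return max(1.0, mos)


if __name__ == "__main__":
    for r in (50, 70, 80, 90):
        print(r, round(r_to_mos(r), 2))
```

An R factor of 80, for example, works out to an estimated MOS of about 4.02. Because R is computed from network-level impairments such as delay and loss, a low estimated MOS can be traced back to a network cause.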

While it is beyond the scope of this paper to present a detailed discussion of all of the attendant protocols and issues, or to offer any sort of endorsement, it is our considered opinion that the critical issue of "voice quality" must be thoroughly addressed, as it can either "make" or "break" a VoIP implementation.

The essential starting point, if one is to seriously consider a VoIP implementation, is assessing the IP network for expected VoIP performance, prior to VoIP deployment. This is required in order to determine what needs to be done to the IP network, and what VoIP systems and architectures will be needed in order to take advantage of the particular IP network that is in place. This will enable an organization to put in place the appropriate IP network architectures, configurations, and possibly QoS mechanisms, to guarantee voice service performance. It will also enable the organization to select the optimal VoIP systems and architectures needed.

In order to develop this guidance, it is essential that a complete and comprehensive assessment be conducted to ensure that nothing is overlooked. It goes without saying that the embedded infrastructure must be thoroughly documented and reviewed: routers, bridges, switches, distribution, gateways, firewalls, servers, etc. Beyond that, however, it is not enough to simply measure IP packet loss, delay, and jitter. Knowing these performance parameters, while establishing important reference points, will not provide an adequate indication of how well a voice service will perform. As indicated above, one must also know how these parameters affect voice clarity and delay.

An adequate pre-VoIP network assessment must benchmark a network's performance in terms of voice clarity and delay, as well as packet loss and jitter. Also, actual end user voice delay should be measured, rather than just IP packet delay. This provides a complete and comprehensive assessment of VoIP network performance, ensuring that critical end user parameters are known prior to designing and deploying VoIP services.

This part of a VoIP readiness assessment is usually done in steps, starting with a simple test and getting more advanced.

  • One call - determine the voice quality of a single call, in two directions
  • Many calls - determine the voice quality of each call, during peak call volume
  • Many calls on a busy network - determine the voice quality of each call, during peak call volume with heavy background traffic

It is important to understand the results at each step before continuing. For example, if the voice quality is 'low' on successive single VoIP calls, then a determination must be made as to the degree to which the underlying network attributes are contributing to the situation. Only after completing the third step, with documented support that voice quality will be acceptable, would an organization be ready to proceed with VoIP deployment.
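The gating logic of these three steps can be sketched in a few lines. This is only an illustration of the "understand each step before continuing" rule. The step names mirror the list above, while the function name, the input format (a list of per-call MOS values for each step), and the 4.0 acceptance threshold are assumptions that an organization would replace with its own.

```python
from typing import Dict, List

# Step names mirror the three assessment steps listed above.
STEPS = ("one call", "many calls", "many calls on a busy network")


def readiness(results: Dict[str, List[float]], min_mos: float = 4.0) -> str:
    """Walk the steps in order and stop at the first that is missing or fails.

    results maps a step name to the MOS of each call measured in that step.
    min_mos is an assumed acceptance threshold, not a standard value.
    """
    for step in STEPS:
        scores = results.get(step)
        if not scores:
            return f"next: run '{step}'"
        if min(scores) < min_mos:
            return f"stop: investigate the network before going past '{step}'"
    return "ready to proceed with VoIP deployment"


if __name__ == "__main__":
    print(readiness({"one call": [4.3, 4.2]}))
    # next: run 'many calls'
    print(readiness({"one call": [4.3], "many calls": [4.2, 3.1]}))
    # stop: investigate the network before going past 'many calls'
```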

Most basic pre-VoIP assessments can be performed using some rather straightforward techniques. While the following is not intended as an exhaustive treatment of an extremely important and somewhat complex area, it will serve as a reference for the reader when considering the extent of a thorough network assessment.

  • Measure voice clarity and delay between each site at which VoIP will be deployed. If trended measurement results fall within the thresholds of acceptability, refer to packet loss and jitter measurement results for acceptable baseline values. These baseline values should be maintained for acceptable service quality when VoIP services have been deployed. However, when actual VoIP services have been deployed, clarity and delay testing should be repeated to certify the deployment. If measurement results indicate potential quality problems, refer to packet loss and jitter measurement results for indications of possible causes.
     
  • Measure VoIP Packet Loss and Jitter for more precise determination of causes of poor voice clarity, or to baseline the performance parameters of the IP network under conditions of acceptable service quality.
     
  • Measure Voice Delay. Round-trip voice delay measurements are valuable because they more accurately characterize a user's experience with regard to delay. A telephone user perceives round-trip voice delay, not one-way voice delay. That is, the delay for a speaker's voice to reach a listener's ear is perceptible to neither speaker nor listener. However, the delay between a speaker saying something, and then hearing the other person's response, is perceptible.
     
  • Baseline Performance with Trending. In order to baseline the network's performance over time, and to determine any variance in performance due to network usage, perform clarity and delay measurements repeatedly and trend the results.
     
  • Perform assessments with VoIP Equipment. One may need to assess a network using a particular VoIP gateway. Generate calls on analog FXO, analog E&M, T1, E1, and ISDN PRI telephony interfaces. The same techniques described previously for testing clarity and delay can be used.
     
  • Assess Performance Against Background Traffic. It is valuable to assess the performance of a VoIP network against a background of actual traffic. In a converged network, this would include both voice and data traffic.
     
  • Test for Echo. Echo is usually the result of an impedance mismatch on analog two-wire to four-wire hybrid junctions. Echo impacts conversational quality with the degree of impact being proportional to the echo signal's level and delay. The greater the echo signal level (or lack of echo return loss), and the greater the echo signal delay, the greater the impact on conversational quality. A call originating on a VoIP network may terminate to a PSTN two wire analog line. The two-wire to four-wire conversion will generate an echo.
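For the packet jitter measurement in the list above, one common approach is the running interarrival jitter estimate defined for RTP in RFC 3550. The sketch below applies that calculation to a list of send and receive timestamps. The function name and the use of milliseconds are our own choices, and the sample trace is invented. Because only differences between timestamps are used, the sender and receiver clocks do not need to be synchronized.

```python
from typing import List, Tuple


def interarrival_jitter(packets: List[Tuple[float, float]]) -> float:
    """Running jitter estimate in the style of RFC 3550.

    packets holds (send_time_ms, receive_time_ms) pairs in arrival order.
    For each consecutive pair, D is the change in transit time, and the
    estimate moves 1/16 of the way toward |D|.
    """
    jitter = 0.0
    for (s_prev, r_prev), (s_cur, r_cur) in zip(packets, packets[1:]):
        d = (r_cur - r_prev) - (s_cur - s_prev)
        jitter += (abs(d) - jitter) / 16.0
    return jitter


if __name__ == "__main__":
    # Packets sent every 20 ms; the second one is delayed an extra 5 ms.
    trace = [(0, 40), (20, 65), (40, 80)]
    print(interarrival_jitter(trace))  # 0.60546875
```

A trace with a constant transit time yields a jitter of zero no matter how large the delay is, which is why delay and jitter have to be measured and reported separately.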

So, yes, VoIP is the "real deal", but the pathway to successful implementation is tedious and detailed. Research has shown that it is very likely that your data network will not deliver the call quality you would like. A recent estimate predicted that 85% of today's router-based data networks are not ready for toll-quality VoIP calls.

Those of you paying attention will notice that I have not mentioned SECURITY. Security will be the topic of another paper.

 


 
P.O. Box 2710  North Canton, Ohio 44720
Phone: (330) 284-0340
jerigsby@jimrigsbyassoc.com