We seem to get a lot of questions about why our speech recognition engine can’t transcribe any random thing you’d like to say. There are two things working against the speech recognition engine: first, open speech transcription is a very hard thing to do without training. I doubt most callers would be willing to hang out for a few minutes training the IVR to understand what they’re saying. Even the best desktop speech recognition systems require keyboard-based user intervention for corrections. This level of preconfiguration and interaction is simply impossible to expect from users who want to quickly get information over the phone.

Second, the public switched telephone network is not a high-fidelity audio system. Your CDs store music at 44.1kHz, 16-bit, stereo. The PSTN transmits audio at 8kHz, 8-bit, mono. That’s a 22-to-1 difference in raw data rate. The mic on your PC usually records at 44.1kHz, 16-bit, mono. That’s still 11-to-1. An IVR is at a huge disadvantage relative to a PC in being able to process speech. It’s also why saying letters to an IVR is a difficult task. Humans have difficulty differentiating between “b” and “p”, or “t” and “d”, and we have the advantage of being able to divine context out of what’s spoken. If people have trouble with spelling over the phone, you can bet a speech recognition engine doesn’t stand a chance.
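
For anyone who wants to check the arithmetic, here is a quick sketch in Python. It only multiplies sample rate, bit depth, and channel count to get a raw data rate; the function and variable names are ours, and we use the exact CD rate of 44,100 Hz.

```python
def bits_per_second(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Raw (uncompressed) data rate of an audio stream."""
    return sample_rate_hz * bit_depth * channels


cd = bits_per_second(44_100, 16, 2)      # CD audio, stereo
pc_mic = bits_per_second(44_100, 16, 1)  # typical PC microphone, mono
pstn = bits_per_second(8_000, 8, 1)      # telephone network, mono

print(f"CD vs. PSTN:     about {round(cd / pstn)} to 1")
print(f"PC mic vs. PSTN: about {round(pc_mic / pstn)} to 1")
```

The PSTN figure works out to 64,000 bits per second, which is all the speech recognizer ever gets to work with.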

Interactive Voice Response (IVR) can affect an entire organization in many ways. A main feature of an IVR system is that you can use it for a majority of calls for your business. This will help minimize the workload and stress on your skilled personnel and allow them to focus on callers that genuinely need their attention. A benefit of this is that a company can considerably minimize overhead and expenses while enabling employees to be more productive.

Another benefit is that IVR automation allows people to contact your company 24 hours a day, 7 days a week for every week of the year. Between this and the advantage mentioned above, your company can gain the important competitive advantage of superior customer service.

You can use IVR for retailer and manufacturer account inquiries or their order status inquiries, shipment locator, pricing information, and information/literature requests. If you are a financial service you can use IVR to provide account balances and history, transfer of funds, bill pay, interest rate inquiries, loan payment calculators, loan applications, credit card activation, and even branch/ATM locator.

Or you can use IVR for healthcare claims, eligibility inquiry, open enrollment, physician referral, test results reporting, facility and patient scheduling, or pre-admission certification. And the same goes for other insurance policy claims and/or questions.

Finally, you could use IVR for human resources (benefits status, open enrollment, training registration, time and job reporting, job postings); for government services (tax refund status, tax filing, ticket inquiries and payments, or assessor inquiries); and for utilities (billing questions, outage reporting, emergency notification, collections notification, or service on/off scheduling).

Regardless of what type of business you are involved in, chances are you can use IVR to lower your business costs, while increasing your employees’ performance.

We’ve got a boatload of bonus survey features for you.

First, a few new UI features:

  • Global speechrec disable — If your respondents are going to be calling in from extremely noisy environments, you can now shut off speechrec for your survey. All of the connective language and built-in prompts will be adjusted accordingly to make your survey DTMF-only.
  • Simple report download — You can easily download your respondent records by referencing a URL with some POST variables filled in. Advanced developers can take advantage of this in their batch download automation scripts.
  • Audio recording tool — Instead of recording audio files and uploading them individually, you can now use our audio recording tool which calls you and walks you through the process of recording all of your prompts right over the phone.

And, finally, one new feature that enables real-time data integration: the “webservice” question type. You can now add a webservice question into your survey that, in a sense, asks your backend SOAP web service a question and then branches based on the response from the web service. This is a very advanced feature that most people won’t use, but for the few that need real-time integration, it opens up a wealth of possibilities. More to come next week on this capability.
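
To make the idea concrete, here is a minimal sketch of the "ask the backend, then branch" pattern. Everything in it is hypothetical: the PIN table stands in for your own database, the question labels are made up, and the SOAP transport is left out entirely so the logic stays visible.

```python
# Hypothetical sketch: none of these names come from the Plum Survey Builder.
KNOWN_PINS = {"1234": "gold", "5678": "standard"}  # stand-in for a backend lookup


def answer_webservice_question(answers: dict) -> str:
    """Play the role of the backend service: take answers, return one string."""
    pin = answers.get("pin", "")
    return KNOWN_PINS.get(pin, "unknown")


# Survey-side skip logic keyed on the returned string (labels are invented).
NEXT_QUESTION = {
    "gold": "q_gold_offer",
    "standard": "q_general",
    "unknown": "q_reenter_pin",
}


def next_question(answers: dict) -> str:
    """Pick the next question based on what the web service said."""
    return NEXT_QUESTION[answer_webservice_question(answers)]


print(next_question({"pin": "1234"}))
print(next_question({"pin": "0000"}))
```

The point of the design is that the survey never needs to know how the backend decided; it only needs a small, fixed set of strings to branch on.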

Contact your account rep if you want more information or want to play with the new Plum Survey Builder.

IVR is an acronym for Interactive Voice Response. It’s a technology that automates interactions with telephone callers. Basically IVR integrates a company’s telephone and computer system, allowing the computer to become a voice computer. This then transforms a caller’s telephone into a terminal capable of directly accessing information and services.

IVRs answer a query in one of three ways: by prompting callers to enter data using the touchtone keypad, by looking up the record in a database, and by speaking back information. They can also ask the caller for information, accept the answers as they are entered on the keypad, and store the information in a database.

Not surprisingly, many businesses have turned to IVRs as a way to decrease the cost of sales, service, collections, inquiry and support calls to and from their company. IVR solutions enable users to retrieve information including bank balances, flight schedules, product details, order status, movie show times, and more from any telephone. Additionally, IVR solutions are increasingly used to place outbound calls to deliver or gather information for appointments, past due bills, and other time critical events and activities.

If you are a business considering using IVR you will require a few essentials. First you’ll need an IVR platform. You’ll also need the IVR applications, back-end servers, telephone infrastructure and IVR experts to help you with the technology. But, you needn’t look any further than Plum Voice. We offer a combination of our own VoiceXML technology and complete professional services and we deliver telephony automation solutions that enable companies to exceed their goals and streamline their processes.

Assuming you arrive at this blog through our homepage, you will have already noticed the new updated look. The changes are more than cosmetic, however. We’ve revamped, rewritten, and reorganized our content to provide better and clearer information to our customers. We plan to soon add a page with some good IVR port capacity planning tools, along with convenient login widgets for our standard VoiceXML hosting service and our survey product.

Hope you like it.

Looks like I took a bit longer getting this list of new features written out for everyone to see.

We’re shooting to deploy the following features by the second week of October:

  • web service question type — Sends just the responses from the current page and awaits a single string from a user-defined web service. This new feature alone opens up a world of integration possibilities. Do you need to validate a respondent’s identity against your database? Ask them to enter some digits and then follow that up with a web service question. Need to apply some skip logic based on the answers to several questions and not just a single multiple choice query? The web service question sends all of the respondent’s answers on the current page to your web service.
  • pull/poll download interface — Downloading a CSV of all your responses for a given time-period is a completely manual process right now. We’re working on a simple-to-use script that can be accessed with HTTP GET variables specifying the account, time period, and survey so it can be embedded in automation tools.
  • global speechrec disable — As cool as speech recognition is, not everyone needs it (as I discussed in my second post ever on this blog). Sometimes shutting it off would simplify things greatly for the survey designer so we’re adding a flag to the survey configuration screen that allows for this.
  • pass back audio conversion errors — Our current audio upload interface fails silently if you attempt to upload a file in a format that we don’t recognize. We need to start passing back error information so you can better debug audio recording issues.
  • outbound audio recorder convenience tool — Finally, the last thing on our list is a way for our customers to not have to worry about audio formats at all. You’ll be able to go to the survey builder’s audio manager, select the prompt you want to start recording and enter your phone number. Our platform will call you up and guide you through the process of recording prompts.
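
As a rough idea of how the pull/poll download could slot into an automation script, here is a sketch that only builds the request URL. The host, path, and parameter names are placeholders we made up for illustration; the real interface may look quite different once it ships.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint, not a real Plum URL.
BASE_URL = "https://surveys.example.com/download.csv"


def build_download_url(account: str, survey: str, start: str, end: str) -> str:
    """Encode the account, survey, and time period as HTTP GET variables."""
    query = urlencode({"account": account, "survey": survey, "start": start, "end": end})
    return f"{BASE_URL}?{query}"


url = build_download_url("acme", "csat survey", "2024-01-01", "2024-01-31")
print(url)

# Round-trip check: the parameters survive URL encoding.
print(parse_qs(urlsplit(url).query)["survey"])
```

A nightly cron job could fetch a URL like this and drop the CSV wherever your reporting tools expect it.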

If we get any of these features done, tested, and deployed early, I’ll be sure to post that news here.

The one feature for our survey platform requested more than any other is the ability to automatically spam call a list of phone numbers to answer a survey. It’s one thing to ask people to click on a link and fill out a web survey. They have to be in front of a computer to do that. It’s another thing to ask them to call into a number and answer some questions over the phone. They can be anywhere to do that, but, hey, they might not be in the mood to bother.

It’s a whole different thing to call them.

As mentioned in a previous post, web surveys can frequently present skewed data because the respondent set is self-selecting. Calling people up at home is still the most reliable way to get someone’s attention. Once someone has picked up the phone, the marginal effort to then answer a handful of questions is usually low enough that most people will suffer through the process.

Anyway, we’ve been hard at work here at Plum and we just finished designing, building, and testing this new capability. The coolest thing about building a new product is being able to quickly iterate it based on requests from real users. Of course, it helps a great deal to have feedback to work from; otherwise you’re designing in a vacuum. It also helps to have a team here at Plum capable of quickly iterating a complex product like Plum Surveys. Fortunately we’ve been getting the feedback and we have the team, so it’s been fun watching this particular product evolve.

The next dev cycle for Plum Surveys concludes the first week of October. I’ll let you in on the new features next week.

Last week I talked about audio encoding formats but did not address how the decoder knows how the encoder encoded the audio (try saying that ten times quickly!)

There are really only two ways to address this issue. Method one is encapsulating the data with the encoding descriptors. Method two is…to guess.

The only encapsulation file format supported by the Plum IVR is Microsoft .wav which is derived from a format called RIFF. People often think that Microsoft .wav is both a file format and an audio encoding format. It isn’t. .wav/RIFF is independent of the audio encoding. Without getting into too much detail, you can think of the .wav/RIFF format as merely an envelope; the data enclosed within the envelope can be encoded any number of ways from PCM or u-law (as mentioned last week) to MP3 to various proprietary audio encoding formats. Thus, it’s important if you are going to create a .wav file that you also make sure that the audio is encoded using one of the formats mentioned last week.
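
If the envelope idea sounds abstract, this sketch may help. It wraps two differently encoded payloads in the same minimal RIFF/WAVE envelope and then reads the descriptors back out of the header without touching the audio. The helper names are ours, and the writer is deliberately minimal: real non-PCM .wav files typically carry a few extra header fields that we skip here.

```python
import struct

# A few registered WAVE format tags; the envelope is the same for all of them.
FORMAT_NAMES = {0x0001: "linear PCM", 0x0006: "a-law", 0x0007: "u-law", 0x0055: "MP3"}


def wrap_in_wav(format_tag, channels, sample_rate, bit_depth, payload):
    """Wrap already-encoded audio bytes in a minimal RIFF/WAVE envelope."""
    block_align = channels * bit_depth // 8
    fmt = struct.pack("<HHIIHH", format_tag, channels, sample_rate,
                      sample_rate * block_align, block_align, bit_depth)
    pad = b"\x00" if len(payload) % 2 else b""  # chunks are padded to even length
    body = (b"WAVE"
            + b"fmt " + struct.pack("<I", len(fmt)) + fmt
            + b"data" + struct.pack("<I", len(payload)) + payload + pad)
    return b"RIFF" + struct.pack("<I", len(body)) + body


def read_wav_format(data):
    """Walk the RIFF chunks and decode the 'fmt ' chunk, ignoring the audio itself."""
    riff, _size, wave_id = struct.unpack_from("<4sI4s", data, 0)
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    offset = 12
    while offset + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack_from("<4sI", data, offset)
        if chunk_id == b"fmt ":
            tag, channels, rate, _byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", data, offset + 8)
            return {"encoding": FORMAT_NAMES.get(tag, hex(tag)),
                    "channels": channels, "sample_rate": rate, "bit_depth": bits}
        offset += 8 + chunk_size + (chunk_size & 1)  # skip to the next chunk
    raise ValueError("no 'fmt ' chunk found")


pcm_file = wrap_in_wav(0x0001, 1, 8000, 16, b"\x00\x00" * 80)
ulaw_file = wrap_in_wav(0x0007, 1, 8000, 8, b"\xff" * 80)
print(read_wav_format(pcm_file))
print(read_wav_format(ulaw_file))
```

Both files start with the same "RIFF" and "WAVE" markers; only the format tag and the descriptors inside the header tell a decoder how to interpret the data that follows.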

That all said, you could also just send the IVR raw audio data and have the IVR guess at the format. You do, however, have to give the IVR a bit of a hint in the form of an appropriate file name extension. If you encoded your audio data as 8kHz 16-bit PCM mono, just slap a “.pcm” on the end of the filename and the IVR will assume that’s the format. On the other hand, if you recorded your audio data as 8kHz 8-bit u-law mono, add “.ul” to the end of your filename. These types of files are often referred to as “raw, headerless” files because there’s no metadata whatsoever in the file; it’s all pure audio data. The downside to this is that there’s nothing to stop you from recording 11kHz 8-bit PCM stereo but still naming the file “whatever.pcm”. The IVR will load it, assume the data is in the format the extension implies, and produce some noisy garbage over your phone lines.
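
Here is a small sketch of what "guessing" from the extension amounts to. It is our own illustration rather than the IVR's actual code: the lookup table holds just the two extensions mentioned above, and the function names are invented.

```python
from pathlib import Path

# Extension -> (encoding, sample rate in Hz, bit depth, channels), per the text above.
RAW_FORMATS = {
    ".pcm": ("linear PCM", 8000, 16, 1),
    ".ul": ("u-law", 8000, 8, 1),
}


def guess_raw_format(filename: str):
    """Guess the encoding of a headerless file purely from its extension."""
    suffix = Path(filename).suffix.lower()
    if suffix not in RAW_FORMATS:
        raise ValueError(f"cannot guess the format of {filename!r}")
    return RAW_FORMATS[suffix]


def assumed_duration_seconds(filename: str, size_bytes: int) -> float:
    """How long the decoder will think the audio is, given only name and size."""
    _encoding, rate, bits, channels = guess_raw_format(filename)
    return size_bytes / (rate * (bits // 8) * channels)


print(guess_raw_format("whatever.pcm"))
print(assumed_duration_seconds("greeting.ul", 16000))
```

Notice that nothing in the guess looks at the bytes themselves, which is exactly why a mislabeled file comes out as noise.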

One final thing to mention is MP3s. The Plum IVR can handle MP3s just fine. However, we often hear complaints about the decline in audio quality between what someone hears when their MP3 is played over their headphones and what is ultimately heard over the phone. Bear in mind: the phone system was never intended to transmit high-fidelity audio. That’s why we usually recommend the lossless formats instead: the application developer can better control the ultimate sound quality when what they hear through headphones closely matches what callers will hear over the phone.

So what would I recommend as the audio encoding format and file encapsulation format? We usually recommend .wav encapsulation of a 16-bit linear PCM, 8kHz, mono audio file. A) the file is self-describing, and B) “16-bit linear PCM” is common to all audio production software. Ideally we’d prefer to recommend u-law instead of 16-bit linear, but u-law often confuses people because it’s sometimes referred to as “mu-law” or sometimes “μ-law”. As usual, our support forum at http://support.plumgroup.com/ is always there to help you work out any audio production issues you might have.
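
If you happen to generate prompts or test tones programmatically, Python's standard `wave` module can produce exactly this combination. The sketch below writes a one-second test tone into an in-memory buffer and reads the header back; the tone, its 440 Hz pitch, and the function name are just our example, and you would pass a filename instead of the buffer to get a file on disk.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # Hz, matching the phone network


def write_test_tone(target, seconds=1.0, frequency_hz=440.0):
    """Write a sine tone as .wav-wrapped 16-bit linear PCM, 8kHz, mono."""
    frame_count = int(SAMPLE_RATE * seconds)
    frames = b"".join(
        struct.pack("<h", int(0.5 * 32767 * math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE)))
        for n in range(frame_count)
    )
    with wave.open(target, "wb") as w:
        w.setnchannels(1)            # mono
        w.setsampwidth(2)            # 2 bytes = 16 bits per sample
        w.setframerate(SAMPLE_RATE)
        w.writeframes(frames)


buffer = io.BytesIO()
write_test_tone(buffer)

# Read the header back to confirm the file describes itself correctly.
buffer.seek(0)
with wave.open(buffer, "rb") as r:
    params = (r.getnchannels(), r.getsampwidth(), r.getframerate(), r.getnframes())
print(params)
```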


Creating pre-recorded audio files is a complicated and involved process that’s exacerbated by the fact most people don’t have a firm grasp of how an audio file format is specified in the first place.

When audio is recorded on a computer, it is encoded as a series of numbers that, when read and decoded by the IVR, can be converted back into sound. In order for this data to be encoded and then, in turn, successfully decoded and converted into sound, the encoder and decoder both need to agree on a set of descriptors for what the numbers represent.

The typical descriptors for an audio file recorded in a non-lossy format are as follows:

  • audio format: linear PCM, u-law, a-law are all examples of audio formats which each specify a different way to map from a numerical data point in a file to a real sound generated by a speaker.
  • bit depth: the number of bits used to specify each data point. Linear PCM, for instance, is usually 8 or 16 bits. u-law and a-law are always 8-bits.
  • number of channels: 1 for mono, 2 for stereo, etc.
  • frequency: the number of data points written to the audio file per channel per second. This is measured in hertz (Hz)
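
To give a feel for how an "audio format" maps numbers to sound, here is a sketch of the textbook u-law companding curve with μ = 255. Treat it as an illustration only: it uses the continuous formula, not the bit-exact lookup that telephone equipment implements, and the function names are ours.

```python
import math

MU = 255  # companding constant for u-law


def mulaw_compress(x: float) -> float:
    """Map a linear sample in [-1, 1] to its companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress: recover the linear sample."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


for sample in (0.01, 0.1, 0.5, 1.0):
    companded = mulaw_compress(sample)
    print(f"linear {sample:>4} -> companded {companded:.3f} -> back {mulaw_expand(companded):.3f}")
```

The curve spends most of its range on quiet samples, which is how u-law gets acceptable speech quality out of only 8 bits per data point.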

The Plum IVR can handle audio files that are 16-bit linear PCM, 8-bit u-law, or 8-bit a-law, single channel (mono) recordings sampled at 8000 Hz. These descriptors are important for IVR for a few reasons. First, if you try to use an audio file that was not recorded with an acceptable encoding, the Plum IVR will not be able to play it. Second, when you initially record your file, it’s always preferable to record it in one of these formats so you won’t have to re-encode the file and possibly introduce noise artifacts into your audio file. Third, these three formats were chosen because they could all be re-encoded with minimal or zero quality loss to 8000Hz mono 8-bit u-law, the standard audio encoding format used by the U.S. public telephone system.
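
One way to keep these rules straight is to write the four descriptors down as a record and check them before you upload anything. This is a sketch of our own, not part of the Plum platform; the class name and the encoding labels are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioDescriptor:
    encoding: str        # "pcm", "ulaw", or "alaw" (labels chosen for this sketch)
    bit_depth: int       # bits per data point
    channels: int        # 1 for mono, 2 for stereo
    sample_rate_hz: int  # data points per channel per second


# Encoding and bit depth pairs listed above as playable.
SUPPORTED = {("pcm", 16), ("ulaw", 8), ("alaw", 8)}


def is_playable(d: AudioDescriptor) -> bool:
    """True if the descriptors match one of the formats listed above."""
    return (
        (d.encoding, d.bit_depth) in SUPPORTED
        and d.channels == 1
        and d.sample_rate_hz == 8000
    )


print(is_playable(AudioDescriptor("pcm", 16, 1, 8000)))
print(is_playable(AudioDescriptor("pcm", 16, 2, 44100)))
```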

This leads to the final question: how do the encoder and decoder agree on the encoding format for the data? We shall discuss encapsulation next week…

Our fine engineers here at Plum have added a new question type to our survey application: the transfer question type. The transfer question type allows the survey designer to insert a phone call anywhere in a survey. This feature, when paired with skip logic, is quite powerful indeed as I’ll discuss further down in this post.

But first there were some design challenges associated with adding this feature:

  • It’s only available for the IVR version of a survey. We felt there wasn’t a good web equivalent of making a phone call and decided, rather than coming up with a weak web counterpart to call transfer that no one will use, it’d be better to simply make this a phone-only feature.
  • What is the “result” of a transfer question? For this iteration of the Plum survey application, a transfer question will return the length of the call. If there’s sufficient interest, we’ve considered returning a recording of the call transfer as the result similar to how a recording is returned for the comment question type.
  • If the caller hangs up during a transfer, they might miss out on questions that occur afterwards. However it’s natural to hang up during a call transfer if it’s the last question in a survey. We decided that in the former case, data will not be saved for this respondent just like if they gave up in the middle of a web survey without finishing it. In the latter case, we’ll return the length of the call up to the point where they hung up and if, indeed, the call transfer was the last question in the survey, the survey is considered completed and the data is saved in the database.
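
The hang-up rule in the last bullet is easier to see as a tiny decision function. This is only a sketch of the behavior described above; the function name and the shape of the returned record are invented for illustration.

```python
from typing import Optional


def transfer_outcome(call_seconds: int, hung_up_during_transfer: bool,
                     is_last_question: bool) -> Optional[dict]:
    """Return the record to keep for a transfer question, or None to discard."""
    if hung_up_during_transfer and not is_last_question:
        # Questions remained unanswered, so treat it like an abandoned web survey.
        return None
    # The result of a transfer question is the length of the call.
    return {"transfer_seconds": call_seconds, "survey_completed": is_last_question}


print(transfer_outcome(95, hung_up_during_transfer=True, is_last_question=False))
print(transfer_outcome(95, hung_up_during_transfer=True, is_last_question=True))
```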

Of course, these design choices are fairly minor matters. The ability to transfer a phone-based survey taker to any phone number based on choices they previously made opens up numerous possibilities for using the Plum survey application as both an enhanced survey tool and a general IVR tool.

First I’ll offer an example of how one could use the transfer question type to enhance an existing survey. Let’s say you’re a call center that wants to ask your customers how satisfied they were with the rep that they reached. Frequently this determination of whether to take a survey occurs at the end of the phone call. This leaves open the possibility that the rep could game the system by only mentioning the satisfaction survey to callers with whom they’ve had a good call.

With the Plum survey application, you could ask the caller if they want to take a survey before they speak with a rep. It would look something like this:

  1. Ask caller if they want to take a survey after the call. If yes, go to step 2. If no, go to step 3.
  2. Transfer the call to a rep. After the rep hangs up, go to step 4.
  3. Transfer the call to a rep. After the rep hangs up, end the survey.
  4. Proceed with asking the caller some questions about the conversation they just had with a rep. Once the caller has answered all of the questions, end the survey.
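
The four steps above are really just a small skip-logic table. Here is a sketch that encodes them and walks a caller through; the outcome labels ("yes", "no", "done") and the function name are ours, not anything the survey builder exposes.

```python
# Each step maps an outcome to the next step; None ends the survey.
FLOW = {
    1: {"yes": 2, "no": 3},  # ask whether they want the survey after the call
    2: {"done": 4},          # transfer to a rep, then continue to the questions
    3: {"done": None},       # transfer to a rep, then end
    4: {"done": None},       # satisfaction questions, then end
}


def walk(outcomes):
    """Return the list of steps visited for a sequence of outcomes."""
    step, visited = 1, []
    for outcome in outcomes:
        visited.append(step)
        step = FLOW[step][outcome]
        if step is None:
            break
    return visited


print(walk(["yes", "done", "done"]))
print(walk(["no", "done"]))
```

Because the caller opts in at step 1, the rep never gets a say in who ends up at step 4.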

Thus, the survey is no longer just a call destination after you’re done talking to a rep. The survey application becomes the entire call, from the first question, through the conversation with the rep, to the satisfaction questions themselves.

Second I’ll offer an example of using the Plum survey application as an IVR autoattendant/call director. You can think of an autoattendant as a series of questions that the IVR asks a caller to figure out where to transfer their call. So even though the Plum survey application isn’t explicitly intended to be used as an autoattendant, it can certainly be used in that manner now that a call transfer is just another question type.

Imagine the following autoattendant structure:

  • Choose a language: English or Spanish
    • If English, choose sales, billing, or technical support in English
      • If sales, transfer them to the English sales line
      • If billing, transfer them to the English billing line
      • If support, transfer them to the English support line
    • If Spanish, choose sales, billing, or technical support in Spanish
      • If sales, transfer them to the Spanish sales line
      • If billing, transfer them to the Spanish billing line
      • If support, transfer them to the Spanish support line

There are six different possible phone numbers to which to direct the caller. The “survey” would end up looking something like this:

  1. Ask the caller if they want English or Spanish. If they choose English, skip to step 2. If they choose Spanish, skip to step 6.
  2. In English, ask if they want sales, billing, or technical support. Skip to step 3 if sales, step 4 if billing, or step 5 if support.
  3. Transfer call to English sales. Once the conversation is over, end the survey.
  4. Transfer call to English billing. Once the conversation is over, end the survey.
  5. Transfer call to English support. Once the conversation is over, end the survey.
  6. In Spanish, ask if they want sales, billing, or technical support. Skip to step 7 if sales, step 8 if billing, or step 9 if support.
  7. Transfer call to Spanish sales. Once the conversation is over, end the survey.
  8. Transfer call to Spanish billing. Once the conversation is over, end the survey.
  9. Transfer call to Spanish support. Once the conversation is over, end the survey.
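
Stripped of the step numbers, this autoattendant is a lookup from two answers (language, then department) to one of six destinations, with each language menu leading to its own three transfer steps. A sketch, using made-up placeholder phone numbers:

```python
# Placeholder numbers for illustration only; substitute your real lines.
ROUTES = {
    ("english", "sales"): "+1-555-0101",
    ("english", "billing"): "+1-555-0102",
    ("english", "support"): "+1-555-0103",
    ("spanish", "sales"): "+1-555-0104",
    ("spanish", "billing"): "+1-555-0105",
    ("spanish", "support"): "+1-555-0106",
}


def route(language: str, department: str) -> str:
    """Return the number to transfer to for a pair of menu choices."""
    return ROUTES[(language.lower(), department.lower())]


print(route("English", "sales"))
print(route("Spanish", "billing"))
```

Adding a third language means adding three more rows to the table, or in survey terms, one more menu question and three more transfer questions.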

Thus, by adding the transfer question type, the Plum survey application is now a fairly general tool. Yes, there are capabilities missing that would make it potentially a completely generalized tool: stateful control-flow logic, large user-defined grammars, and direct data integration to name a few. And, no, not all of them will be built into the Plum survey application. But even as-is, most users should be able to design and build many simple non-integrated applications quickly and cost-effectively using a tool that relies on a simple survey paradigm.

We’ve got more features on the way, so stay tuned.