Northern Group: Voice Recognition

April 30, 1998

Around 15 members attended the Northern Group’s meeting on voice recognition held in Manchester on 25 March 1998. The speakers were John Kay of Molesworths and Nick Millington of JM Computing.

John Kay spoke first about the user’s perspective. He said that he had been using voice recognition software for around a year. As a result of his experience, he thought that it would not do away with secretaries, as users have to learn to use the software and the equipment they are running it on. However, it could greatly increase productivity. He felt that he could add information and comments to documents far faster than by typing. One caveat was that one had to train oneself to read what one had dictated. The system didn’t have a secretary’s intelligence, and was capable of producing results which looked beautiful but were in fact rubbish. He also recommended that fee-earners should have someone to take them through the learning process.

Nick Millington then spoke. He told us that JM Computing is the largest independent reseller in the North West. They sell into several markets, but the largest market they supply is the legal one. They supply case management and accounts software. They have been selling voice systems for about three years, and find that 90% of the voice systems they supply are to lawyers. They have supplied firms of all sizes from sole practitioners to offices of large national firms.

Nick started by giving a flavour and history of voice products. He said that by and large they do two things. They take dictation, and they give command and control over the computer. The split between these two functions is one of the things that varies between products.

Commercial products have been available since 1993. Initially they were expensive DOS products. These were followed by expensive Windows packages, but now there are relatively cheap Windows packages. For example, IBM had reduced the price of one of their products from £500 to £100. The other main distinction is between discrete and continuous speech packages. Until the summer of 1997 only discrete packages were available. These require the user to leave a distinct gap between words. Continuous speech products were then marketed. These still do not recognise what might be called normal speech, as users have to pronounce every word clearly, but they no longer need to leave gaps between their words.

There are three major players in the market: IBM, Dragon and Philips. JM Computing recommend IBM as producer of choice for the legal market. Philips systems are aimed at very specialist applications, and Dragon systems have a high degree of command and control functions at, in JM Computing’s view, the expense of dictation.

Dragon initially started by writing packages for the disabled market. This means they have very good command and control functions, but solicitors really need good dictation rather than command and control. Dragon software uses algorithms which are quite processor intensive. They were the first to introduce a continuous speech system called Naturally Speaking. In his view the strengths of Dragon Systems are their excellent command and control and accuracy, but their disadvantages are the need to correct as you dictate and a slower learning process than IBM’s.

IBM developed their systems from US military technology. The military have voice recognition systems to allow fighter pilots to control their weapons. The IBM system uses phonemes. JM see IBM’s strengths as being able to correct after finishing dictation, speed and accuracy, and having a statistical voice model which analyses the user’s common expressions and uses this information in recognising dictation. The learning process is also relatively fast. The IBM system’s disadvantage is that the command and control functions are relatively poor. There are few directly built in, but some enthusiasts have developed a large repertoire of command and control functions by using macros.

In JM Computing’s experience discrete systems can manage speeds between 60 and 120 words per minute. Speed seems to depend very much on the individual user, and the way in which they speak. Continuous systems are rated at 100 to 130 words per minute, although they believe that some of their users might be managing over 150.

The main current packages are Dragon Dictate and IBM Simply Speaking for discrete dictation, and Dragon Naturally Speaking and IBM Via Voice for continuous speech. There are some cheaper home packages available but he did not recommend them for business use because they did not have the facility to use macros. Macros are a very important part of productivity. Normally training will include a day analysing the user’s speech and document patterns and setting up appropriate macros. Indeed, it has reached the point where some users can draft wills simply by calling out a series of macro clauses such as ‘will 5, will 16’ which automatically insert the appropriate text into the document.
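The idea behind such macros can be illustrated with a minimal sketch in Python. It is not how Via Voice or any other product implements them. The macro names follow the ‘will 5, will 16’ example above, but the table, the clause wording and the function name expand_macros are all invented for illustration.

```python
# Hypothetical macro table: spoken macro name -> standard clause text.
# The clause wording is invented for illustration only.
MACROS = {
    "will 5": "I appoint my spouse to be the sole executor of this will.",
    "will 16": "I give the residue of my estate to my children in equal shares.",
}


def expand_macros(dictated, macros=MACROS):
    """Replace each comma-separated macro name with its stored text.

    Phrases that are not macro names are passed through unchanged.
    """
    parts = [p.strip() for p in dictated.split(",")]
    return "\n\n".join(macros.get(p.lower(), p) for p in parts if p)


# Calling out two macro names yields two full clauses.
print(expand_macros("will 5, will 16"))
```

A short spoken phrase stands in for a whole clause, so the user dictates a handful of words while the document gains several paragraphs of standard text.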

Discrete systems need a 486 DX-2 machine with 24 megabytes of RAM. Continuous speech requires a Pentium 200 MMX with 64 megabytes of RAM. As with many things, the more RAM in the computer the better.

Nick Millington concluded his talk by speculating on where the technology might go. There were already digital dictaphones on the market. It was in theory possible to record something on these and take it back to the office for the voice recognition system to type it up. However, this was not something that JM Computing had yet attempted. As technology improved, presumably packages would be able to recognise true continuous speech. One further idea that had been suggested to him recently was that it might be possible to use eye control (a technology that the military already have) to control the computer, so that it would be possible to eliminate both keyboard and mouse. He then gave us a demonstration of Via Voice in practice, showing how it was able to recognise words in context and could produce large quantities of standard text from macros. This was followed by a question and answer session.

Our thanks are due to John Kay and Nick Millington, and to Eversheds for arranging the venue.