Speech recognition for UNIX

This seems to be the day for open source speech recognition announcements. “CMU Sphinx (the speech recognition software developed at CMU and funded by DARPA and NSF for the last 15 years) has gone open source and is up for download on SourceForge. You can check out the announcement, go to the homepage at CMU, or download the code for yourself. It should build out of the box on several platforms (Linux, FreeBSD, sun4m, etc.), but work is still needed. Help with documentation would be greatly appreciated, too. It’s important that people grab this stuff ASAP, just in case some people decide to go after it for potential patent violations (we all know how much people love the patent system).”

The main project Web page: http://cmusphinx.sourceforge.net/

You can download it at:
http://sourceforge.net/projects/cmusphinx/files/sphinx2/0.1.0a/sphinx2-0.1a.tar.gz/download

Comment 1: I just received this today, which may be of help for those looking for speech recognition software on UNIX boxes (does anybody know of any other software for this platform?). I have not managed to install it yet, so I cannot comment on particulars, but I will let you know when I manage to try it.

Comment 2: I was going to say that Dragon Dictate is being ported, but now that I look a bit closer, it looks as if esr’s message might originally have been posted to the ddlinux mailing list. I wonder if anyone is doing an rpm.

Comment 3: If you’re talking about sphinx2, there’s a .spec file in the distro – I should be able to bundle up an .rpm if there’s not one already (fingers willing of course).

Comment 4: I’ve just joined the list so I didn’t get the original message. Our department may be interested in speech recognition for Linux. Please could someone send me the original mailing?

Comment 5: Sorry. The rpm program (Red Hat Package Manager) is an installation manager. A .rpm file is a package of everything needed to install a piece of software correctly, for instance apache-1.3.6-7.i386.rpm. It’s a great boon because it checks dependencies, libraries and so on, and won’t install if it finds anything that will stop the software from working correctly.
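
For anyone wondering what "checks dependencies" means in practice, here is a toy sketch in Python. It is only an illustration of the idea, not how rpm itself is implemented; the package names and the helper functions `missing_dependencies` and `can_install` are made up for this example.

```python
# Toy model of a package manager's dependency check (NOT rpm's real logic).
# All package names below are hypothetical examples.

def missing_dependencies(requires, installed):
    """Return the required package names that are not installed, in order."""
    return [name for name in requires if name not in installed]


def can_install(requires, installed):
    """A package may be installed only if nothing it requires is missing."""
    return not missing_dependencies(requires, installed)


# What is already on the (imaginary) system.
installed = {"glibc", "zlib"}

# What an (imaginary) web server package says it needs.
web_server_requires = ["glibc", "zlib", "openssl"]

print(missing_dependencies(web_server_requires, installed))
print(can_install(web_server_requires, installed))
```

In this sketch the install is refused because one required package is absent, which is roughly the behaviour described above: the package manager stops before anything is installed rather than leaving you with software that cannot run.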

Comment 6: Well, I meant sphinx, at any rate. What’s different about sphinx2? If there were an .rpm, it might be possible to put it on the website so that people could download it. 

Comment 7: That is, I believe, the name of the current source:
http://download.sourceforge.net/cmusphinx/sphinx2-0.1a.tar.gz
If I manage the RPM build tonight I’ll upload it to the usual suspects, and also make it available on one of my sites.

Comment 8: A personal review would be more useful.

Comment 9: A review would be interesting, yes, but I question whether it would be more useful than a .rpm. Given that the software is free, if there’s a package available it’s very easy for users to install it and find out for themselves whether the software does what they want it to do.

Comment 10: I think most users would like a review before they went to the bother of installing it, especially if they have got RSI and don’t wish to perform unnecessary tasks on the computer. It looks like I’ve stepped into a den of Unix nuts. I am hoping to see more discussion about recovery strategies for people with RSI!

Comment 11: Does anyone know if any of the major speech recognition software manufacturers have released their products commercially for any Unix-type platform? I’m thinking of Dragon, IBM, Philips and L&H. My body isn’t up to a survey around these people’s Web sites today, so I thought I’d ask some people who might know!

Comment 12: Not that I’m aware of. Dragon have allegedly been porting their stuff for quite some time now, and IBM have released libraries for Linux – but not apps, so it’s up to the Linux community to do something useful with them. Even vocabulary building has to be done on a Windows machine as far as I understand it. It’s a start, but a far cry from anything useful. I’m very doubtful that Linux will have useful voice stuff any time soon given the amount of integration the voice software needs with the GUI in order to control programs effectively. I’d like to be wrong though, and maybe I am.

Comment 13: I think the geeks took over *temporarily*. Usually this list is about RSI, but a lot of us use voice software.

Comment 14: Ha ha! “Down, geeks, down!” Being slightly geeky myself, I recognise the tendency to go into bamboozle-mode.

Comment 15: I suspect UNIX types are more prone to keyboard-related cumulative trauma, rather than problems from using pointing devices. As a programmer and general UNIX systems guy, almost everything I do involves typing. Yes, there’s a GUI around many UNIX OSs these days, but I rarely use it for more than opening up windows to type into. I just sit here and type all day… eventually something had to give.

Comment 16: I can see it might look that way. You happen to have joined during one of the very few times Unix has ever been mentioned. Which is curious, I think. Are Unix users less prone to RSI than users of Windows and Macs, I wonder, and if so, why? Might it have anything to do with the fact that Unix is less tightly bound up with the use of GUIs and therefore mice? I think this is an interesting question, because if the answer were yes it would suggest (to me, at least) that a lot more might be done to prevent RSI through the design of GUIs. But it could be that just as many Unix users get RSI, but for one reason or another they don’t join this list, or if they do join they don’t happen to mention Unix-specific problems.

There is some information on the website about recovery strategies other people have used. And there does indeed tend to be a lot of discussion about these things on the list. You just happened to get here on Unix day. Coping strategies are also important, though, and the use of voice recognition (VR) has proved to be one of the most successful coping strategies of all. So from time to time the subject of VR, on whatever platform, does crop up.

Comment 17: I used UNIX for 8 years until I started a new job at the beginning of this year (I’ve had RSI for 4 years). I now use a PC running Linux and my left elbow hurts a lot more because of the extra mouse clicking I need to do each time I need to activate a window (which is an awful lot). The nice thing about UNIX is that windows are activated merely by positioning the cursor over them. I hate all this extra clicking.

Comment 18: I think I’m missing something. If you’re using Linux, and therefore presumably X, can’t you configure it so that focus follows the mouse?

