Adding Speech Recognition to Your OS X App

Mac OS X has long had (rumor has it, even in the dark pre-X days) excellent speech recognition and speech output capabilities. There is no training involved; it just works. The OS supports it, and many apps support it. It's great.

To enable speech, open the Speech preference pane in System Preferences and give it a test drive. Try great commands like "What time is it?", "Tell me a joke", or "Switch to Finder". OK, so those aren't great. People really just think you're weird for shouting at your computer.

What I'm here to tell you today, avid reader (yes, singular, "Hi!"), is that it is incredibly easy to add this capability to your own Cocoa applications.

Apple's Developer Docs: Speech presents the basics but leaves out a couple of important points.

1) Add an NSSpeechRecognizer instance variable to your app's class interface definition (hint: it's the .h file):


NSSpeechRecognizer* recog;
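For context, here is a rough sketch of where that line might sit in the header. RobotController is a made-up class name, not something from Apple's docs. The NSSpeechRecognizerDelegate part assumes the 10.6 SDK or later, where the delegate methods live in a formal protocol; on older SDKs, leave off the angle-bracket part.

```objc
#import <Cocoa/Cocoa.h>

// RobotController is a hypothetical class name used only for illustration.
// <NSSpeechRecognizerDelegate> assumes the 10.6 SDK or later.
@interface RobotController : NSObject <NSSpeechRecognizerDelegate> {
    NSSpeechRecognizer *recog; // created in -init, owned by this object
}
@end
```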

2) Initialize the recognizer and specify the commands your app is looking for:


- (id)init {
    self = [super init];
    if (self) {
        NSArray *cmds = [NSArray arrayWithObjects:@"Forward",
            @"Stop", @"Left", @"Right", @"Backwards", @"Roll over", nil];
        recog = [[NSSpeechRecognizer alloc] init]; // recog is an ivar
        [recog setCommands:cmds];
        [recog setDelegate:self];
    }
    return self;
}

In this example I used the great commands, "Forward", "Stop", and "Roll over". ;)

3) Add the speechRecognizer:didRecognizeCommand: delegate method, which is what the system speech recognizer calls when it picks one of your commands out of the user's jabbering:


- (void)speechRecognizer:(NSSpeechRecognizer *)sender
     didRecognizeCommand:(id)aCmd {
    if ([(NSString *)aCmd isEqualToString:@"Forward"]) {
        NSLog(@"Forward called");
        return;
    }
    else if ([(NSString *)aCmd isEqualToString:@"Stop"]) {
        NSLog(@"Stop called");
        return;
    }
    else if ([(NSString *)aCmd isEqualToString:@"Roll over"]) {
        NSLog(@"Rollover called");
        // .... some response here...
    }
}
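Once the command list grows, that if/else chain gets long. One alternative (my own sketch, not anything Apple's docs prescribe) is to keep a table from spoken command to selector name and look the command up. CommandDispatcher and the goForward, stop, and rollOver methods are hypothetical names, and memory management is left out to keep it short.

```objc
#import <Cocoa/Cocoa.h>

// CommandDispatcher and its action methods are hypothetical names.
@interface CommandDispatcher : NSObject <NSSpeechRecognizerDelegate> {
    NSDictionary *actions; // spoken command -> name of the selector to call
}
- (NSArray *)commands;
@end

@implementation CommandDispatcher

- (id)init {
    self = [super init];
    if (self) {
        // initWithObjectsAndKeys: takes value, key pairs, in that order.
        actions = [[NSDictionary alloc] initWithObjectsAndKeys:
                      @"goForward", @"Forward",
                      @"stop",      @"Stop",
                      @"rollOver",  @"Roll over",
                      nil];
    }
    return self;
}

// Hand this to -setCommands: so the table and the recognizer stay in sync.
- (NSArray *)commands {
    return [actions allKeys];
}

- (void)goForward { NSLog(@"Forward called"); }
- (void)stop      { NSLog(@"Stop called"); }
- (void)rollOver  { NSLog(@"Roll over called"); }

- (void)speechRecognizer:(NSSpeechRecognizer *)sender
     didRecognizeCommand:(id)aCmd {
    NSString *name = [actions objectForKey:aCmd];
    if (name == nil) {
        return; // not a command in our table
    }
    SEL action = NSSelectorFromString(name);
    if ([self respondsToSelector:action]) {
        [self performSelector:action];
    }
}

@end
```

To wire it up, you would pass [dispatcher commands] to setCommands: and the dispatcher itself to setDelegate:. Adding a new command then means one new table entry and one new method.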

4) Add startup and cleanup code:
awakeFromNib:


    [recog startListening];

dealloc:


    [recog stopListening];
    [recog release]; // send release, never dealloc, to an object you own

Now run your app and open the "Speech Commands" window; your app's commands should be listed there. Start speaking and your commands will get called (NSLog is a good debugging aid for this).


About the Author

Andrew Turner is an advocate of open standards and open data. He is actively involved in many organizations developing and supporting open standards, including OpenStreetMap, Open Geospatial Consortium, Open Web Foundation, OSGeo, and the World Wide Web Consortium. He co-founded CrisisCommons, a community of volunteers that, in coordination with government agencies and disaster response groups, build technology tools to help people in need during and after a crisis such as an earthquake, tsunami, tornado, hurricane, flood, or wildfire.