More and more surveillance camera systems now advertise an option for multiple microphones. Adding sophisticated ears to the eyes (perhaps smell is next) is an obvious evolution of surveillance. If you accept the argument that a camera helps a security team expand its presence, then more data helps the team interpret the situation it sees. Parents with baby monitors might be the leading market for this technology. Prison and ship IP-based intercoms also come to mind. Perhaps I should not talk about parents and prison guards in the same paragraph…
Two people standing and yelling at each other could look, on camera, like just two people standing. Yelling is an audio data point, so adding audio lets a human responder capture better detail and pick up on urgency and relevance. Adding a voice through speakers also gives the responder a tool to engage remotely, more quickly than in person. The trigger mechanism for the audio is also evolving. Systems already attempt to trigger an alert on tones of anger or fear. I haven’t seen a dictionary-based trigger yet, but it’s probably available.
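A dictionary-based trigger is easy to picture. Here is a minimal sketch, assuming the audio has already been turned into text by some speech-to-text step (not shown). The word list and the function name `find_triggers` are hypothetical and not taken from any product.

```python
import re

# Hypothetical watch list; a real deployment would choose its own terms.
TRIGGER_WORDS = {"help", "fire", "stop", "police"}

def find_triggers(transcript, trigger_words=TRIGGER_WORDS):
    """Return the sorted trigger words that appear in a transcript snippet."""
    # Lowercase the text and split it into simple word tokens.
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    # Keep only the tokens that are on the watch list.
    return sorted(words & set(trigger_words))

print(find_triggers("Somebody help, call the police!"))  # ['help', 'police']
print(find_triggers("Nice weather today."))              # []
```

In a setup like this, an alert would only need to carry the matched words, not the whole conversation.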
Of course, expanding the amount of data collected raises the question of security management to protect privacy. Use of the trigger/alert system can reduce some concern about privacy by removing the need for the system to record or expose all data. However, that does not mean you can trust that secure procedures will be used, as illustrated in a Zenitel video about an IP-based system. Why is “unsecure” even an option?