I have a project that involves real-time animatronics. As such, I wish to have the application running on a small self-contained (no net connection) system that will listen to a microphone and extract a stream of phonemes from what is said.
These phonemes can then be used to articulate servos to give an approximation of the lip and tongue movements.
The key word is approximate - I am not looking to provide lip-reading capability. Rather, the lips of, say, a werewolf or orc should purse, widen, and narrow in response to the actor's speech.
My target platform is an RPi 3 or similar, running headless except for the microphone input and GPIO output.
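To make the goal concrete, here is a minimal sketch of the phoneme-to-servo mapping stage I have in mind. It assumes some upstream offline recognizer (e.g. PocketSphinx, which can run on a Pi) emits ARPAbet phoneme labels; the viseme groupings and servo pulse ranges below are illustrative placeholders, not tested values:

```python
# Hypothetical sketch: collapse phonemes into coarse visemes and convert
# those to servo pulse widths. Assumes an upstream recognizer supplies
# ARPAbet phoneme labels; all angles/ranges here are placeholders.

# Coarse viseme groups: (jaw_open, lip_width) as fractions 0.0-1.0.
VISEME_MAP = {
    # open vowels -> jaw drops
    "AA": (0.9, 0.4), "AE": (0.8, 0.6), "AH": (0.7, 0.4),
    # rounded vowels -> lips purse (narrow)
    "OW": (0.5, 0.1), "UW": (0.3, 0.0), "AO": (0.6, 0.2),
    # spread vowels -> lips widen
    "IY": (0.3, 0.9), "EH": (0.5, 0.7),
    # bilabial consonants -> mouth closes
    "P": (0.0, 0.3), "B": (0.0, 0.3), "M": (0.0, 0.3),
    # labiodental -> slight opening
    "F": (0.2, 0.5), "V": (0.2, 0.5),
}
NEUTRAL = (0.1, 0.3)  # silence or any phoneme not in the map

def servo_pulses(phoneme, jaw_range=(1000, 2000), lip_range=(1000, 2000)):
    """Map a phoneme label to (jaw, lip) servo pulse widths in microseconds.

    The 1000-2000 us range is the typical hobby-servo span; the actual
    endpoints would be calibrated to the mask's mechanics.
    """
    jaw, lip = VISEME_MAP.get(phoneme, NEUTRAL)
    scale = lambda frac, rng: int(rng[0] + frac * (rng[1] - rng[0]))
    return scale(jaw, jaw_range), scale(lip, lip_range)
```

The point of collapsing phonemes into a handful of visemes is exactly the "approximate" requirement above: the recognizer can be fairly inaccurate and the mouth will still move plausibly, since many phonemes share a mouth shape anyway.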
Let's say you have a werewolf in a movie. It can talk, growl, snarl, move its ears, and generate other expressions. What you don't see is the 5 or 6 puppeteers off to one side who control all of the servos to make the creature 'act'.
This works well on a film set. However, if you want to take the same creature to a convention or cosplay environment, it is impractical to have puppeteers trying to handle unscripted situations.
I can use the actor/wearer's face to pick up various movements like the eyebrows. However, if the suit can also pick up the actor's speech and approximate the enunciation, that would be brilliant.
Many thanks for any advice etc.