The results obtained confirm the theory that adding the visual modality increases the accuracy of ASR systems. It was observed that, even when both the audio and the visual recognition systems fail to recognise a particular phoneme, the combined audio-visual recognition is more likely to succeed. This observation is attributed to the fact that the audio-visual integration scheme maximises the output probabilities of both modalities. It was also observed that highly confusable audio phonemes, e.g. /f/ and /th/, were more easily recognised by the visual modality, and the same applied to visemes, e.g. /w/ and /r/. The preliminary results of our AVASR system, while meeting the synergy requirements, will be improved using more training data. The implementation of the AVASR system has allowed us to gain further insight into the problem of multi-modal integration.
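The synergy effect described above can be illustrated with a minimal sketch. It assumes a simple late-fusion rule (a weighted sum of per-class log-probabilities), which is a common choice but is not necessarily the integration scheme used in this system. The function names, the weight, and the scores are all hypothetical and chosen only for illustration.

```python
import math

def fuse(audio_probs, visual_probs, audio_weight=0.5):
    """Weighted sum of log-probabilities per class (assumed fusion rule)."""
    classes = audio_probs.keys() & visual_probs.keys()
    return {
        c: audio_weight * math.log(audio_probs[c])
        + (1.0 - audio_weight) * math.log(visual_probs[c])
        for c in classes
    }

def best(scores):
    """Return the class with the highest score."""
    return max(scores, key=scores.get)

# Illustrative (made-up) posteriors for a frame whose true class is /f/.
audio = {"f": 0.35, "th": 0.40, "w": 0.25}   # audio alone prefers /th/
visual = {"f": 0.35, "th": 0.25, "w": 0.40}  # visual alone prefers /w/

audio_pick = best(audio)
visual_pick = best(visual)
fused_pick = best(fuse(audio, visual))
print(audio_pick, visual_pick, fused_pick)
```

In this toy case each modality alone picks a wrong class, but /f/ is the only class scored reasonably by both, so the fused decision selects it. This shows the kind of behaviour reported above under the stated assumptions, not a measurement.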
