Non-verbal communication is of paramount importance in person-to-person interaction, because emotions are an integral part of being human. A sociable robot should therefore display similar abilities in order to interact seamlessly with its user. This work proposes a model for inferring the emotion conveyed by a person who is talking in real situations. The model is based on the analysis of instantaneous emotion by Kalman filtering and on the continuous movement of the emotional state over an Emotional Surface; in the tests conducted, it produced evaluations similar to those made by humans. A simulation-optimization heuristic for tuning the system is also described, which allows easy adaptation to various facial expression analysis applications. © 2012 The Brazilian Computer Society.