iTUTOR and iDEMO: 3D Computer Vision/Graphics with Intelligent Avatar Interaction
- Greta Myers, Binghamton University
This project will develop a new instructional methodology built on an innovative computer-vision-based instruction system in a smart-room setup, with (1) novel story-telling software that visualizes course content by automatically generating graphical scenes from speech and text, and (2) a 3D-avatar-based virtual instructor that communicates with students through automatic affective-state (emotion) analysis using facial expression recognition technology. The project’s goal is to enhance teaching and learning in STEM courses (e.g., computer science, engineering, math, or biology) at the college level.
Visual teaching and learning – the use of graphics, images, and animations to enable and enhance teaching and learning – is one important strategy that we plan to employ. Recent developments in computer graphics, multimedia, and human-computer interaction technologies have opened new opportunities for educators to engage students in science, engineering, and math. PI Yin and his graduate students are developing tools for computer-based virtual avatars [Wei04] and for human behavior synthesis, analysis, and simulation [Wang06, Wang07, Sun08]. His team is also developing animation tools for graphics model visualization and simulation, which will be used in teaching STEM courses (e.g., computer science, engineering, and math). PI Yin has developed story-telling software [LeeYin06] that automatically translates voice to text, and text to graphical scenes. Instructors’ and students’ ideas, stories, and presentations can be translated into and visualized as a graphical movie intuitively and instantly. Such state-of-the-art software will enrich the teaching and learning experience; develop students’ writing, presentation, and communication skills; and enhance students’ creativity and critical thinking.
In this project, PI Yin and his group will focus on two research aspects of innovative instruction technology development:
(1) Utilize computer animation tools in classroom teaching: Many basic and abstract concepts in STEM courses (e.g., computer science, engineering, math, and biology) can be explained and demonstrated intuitively through computer graphics animation software. We will apply a synthesized graphical avatar [Wei04] as a virtual instructor to interact with students, making classroom teaching and learning interesting and enjoyable.
(2) Develop powerful and easy-to-use speech-to-text-to-graphics software for course teaching: Such speech-to-graphics story-telling software [LeeYin06] can interpret many science concepts visually and intuitively. We will work with engineering school faculty to decide which objects to model for specific courses. A “virtual tutor” that mimics realistic facial expressions and has strong interaction ability will be developed. Course materials will be read and presented by the virtual avatar, which is a synthesized human face [WeiZhu04, Smith06]. Students will carry out simple programming to change the appearance of the virtual tutor, or even replace the human face with an animal face. The system will also analyze students’ facial expressions in order to gauge the students’ degree of involvement in the classroom. This analysis will serve as feedback for the instructor to adjust the course materials and the format of the presentations. We will explain the principles of the story-telling tool and solicit feedback from students and teachers to revise and improve the algorithm and software design. Figures 1-2 illustrate the system setup and examples of the 3D virtual avatar (as a virtual tutor). Here is an example working scenario:
In a regular classroom equipped with audio-video devices, an instructor giving a lecture (e.g., the cs460 computer graphics course) is explaining the ray tracing algorithm. The instructor’s spoken explanation is automatically translated to text, and the text into a graphical animation projected on screen that demonstrates ray tracing, reflection, lighting, and intensity computation. This is realized by our iDEMO story-telling software system.
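The lighting and reflection quantities that such an animation would demonstrate reduce to a few lines of arithmetic. The following is an illustrative sketch of Lambertian shading and mirror reflection; the function names and coefficient values are ours, not part of the iDEMO system:

```python
def lambert_intensity(normal, light_dir, ambient=0.1, diffuse=0.9):
    """Diffuse intensity at a surface point: I = k_a + k_d * max(0, N.L).
    normal and light_dir are assumed to be unit-length 3-tuples."""
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    return ambient + diffuse * max(0.0, n_dot_l)

def reflect(incident, normal):
    """Mirror-reflection direction: R = I - 2 (I.N) N."""
    d = sum(i * n for i, n in zip(incident, normal))
    return tuple(i - 2 * d * n for i, n in zip(incident, normal))

# Light shining straight down onto an upward-facing surface:
print(lambert_intensity((0, 1, 0), (0, 1, 0)))  # 1.0
print(reflect((0, -1, 0), (0, 1, 0)))           # (0, 1, 0)
```

Animating how the reflected ray and intensity change as the light moves is exactly the kind of demonstration the projected animation would show.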
While the algorithm is being illustrated and discussed, students’ reactions (facial expressions, hand gestures, eye gaze, etc.) are monitored by a video camera. A virtual tutor (a 3D avatar acting as a teaching assistant) interacts with students based on their recognized expressions, responding and adjusting the content accordingly by either repeating the algorithm or explaining it at a different level of depth. This is realized by our iTUTOR intelligent interaction software system.
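The iTUTOR feedback loop can be sketched as a simple policy over recognized expressions. The labels and thresholds below are purely illustrative assumptions, not the project’s actual recognition pipeline:

```python
def adjust_presentation(expressions):
    """Map the class's recognized expressions (hypothetical labels such as
    'confused' or 'neutral') to a presentation action for the virtual tutor."""
    confused = sum(e == "confused" for e in expressions)
    ratio = confused / len(expressions)
    if ratio > 0.5:
        # Most of the class looks lost: repeat with a simpler explanation.
        return "repeat_with_simpler_example"
    if ratio > 0.2:
        # A minority is confused: add detail without restarting.
        return "add_detail"
    return "continue"

print(adjust_presentation(["confused", "neutral", "confused", "confused"]))
# repeat_with_simpler_example
```

In the actual system, the expression labels would come from the facial expression recognition work cited above [Wang06, Wang07, Sun08] rather than being given directly.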
In this project, an Idea Illustration and Demonstration (iDEMO) software tool will be created that provides instant visualization of an instructor’s ideas, theories, illustrations, demonstrations, and experiments for use in the classroom. The software allows a story presented by teachers and/or students to be visualized as an animated movie, thus greatly facilitating the training of students’ writing, presentation and public-speaking, and technical communication skills. Examples of applying the story-telling software to face-to-face and/or virtual online learning include: (a) visualizing and presenting atomic and molecular structures graphically in engineering classes; (b) visualizing and animating biological data for life science classes; (c) constructing and animating moving objects for physical science and computer science classes; and (d) explaining and illustrating geometry and mathematics in basic mathematics classes.
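At its core, the speech-to-text-to-graphics path maps recognized words to renderable scene objects. The following is a rough sketch of that idea; the keyword table and object names are hypothetical, not the actual iDEMO implementation [LeeYin06]:

```python
# Hypothetical mapping from recognized keywords to renderable scene objects.
SCENE_OBJECTS = {
    "ray": "RayPrimitive",
    "sphere": "SphereMesh",
    "light": "PointLight",
}

def speech_to_text(audio_transcript):
    # Stand-in for a real speech recognizer; here the transcript is given.
    return audio_transcript.lower()

def text_to_scene(text):
    # Keep only the words that correspond to known scene objects.
    return [SCENE_OBJECTS[w] for w in text.split() if w in SCENE_OBJECTS]

scene = text_to_scene(speech_to_text("A ray hits the sphere under a light"))
print(scene)  # ['RayPrimitive', 'SphereMesh', 'PointLight']
```

A real system would of course use natural language parsing rather than keyword lookup, but the two-stage structure (speech to text, text to scene) is the same.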
Through this project, we will create an Intelligent Virtual Tutor System (iTUTOR) using a synthesized graphical avatar as a “virtual instructor” to interact with students. The graphical visualization tools and user-friendly “virtual tutor” will make teaching and learning more interesting and enjoyable. The iTUTOR will recognize students’ expressions, eye gaze, hand gestures, and reactions, and adjust the presentation materials accordingly at various levels. Using the virtual instructors, we can bring these experiences into the classroom to demonstrate new observations or discrepant events that the students will be challenged to explain. While a solution will be provided for the teachers, it will be up to each individual teacher to facilitate the discussion and incorporate the materials most effectively within their classroom environment.
Through this project, we will test our system in a number of undergraduate and graduate courses (e.g., cs460 and cs560 computer graphics in Fall 2013; cs455 and cs555 visual information processing in Spring 2014; and cs375 design and analysis of algorithms in Fall 2013). The systems created and tested will be reported and submitted for publication. Besides dissemination through journal and conference publications, the PIs will set up a program website to disseminate the project’s results and a wiki to share experiences with similar programs across the country. Animation models and software generated to assist course teaching will also be made available to the research and education communities.
Each implementation stage of the project will include formative and summative evaluations coordinated by an external professional evaluator, Dr. Greta Myers (see her biosketch and letter of support). Dr. Myers holds a Ph.D. in Psychology and has over 25 years of experience in assessing commercial enterprises (IBM, AT&T) and educational programs (K-12 and higher education). She will be responsible for creating surveys and structured interview forms, and for analyzing evaluation results. The goal of the evaluation effort is to understand the impact that our project has on both students’ and teachers’ interests in STEM learning and teaching, and on students’ progress throughout their degree programs. We will conduct follow-up investigations to determine the impact of the intervention on students’ actual career-related outcomes. Dr. Myers will create survey instruments to assess cognitive and affective aspects of student learning and of student perception of the project’s impact.
[LeeYin06] L. Seversky and L. Yin, “Real-time automatic 3D scene generation from natural language voice and text descriptions”, Proceedings of the 14th ACM International Conference on Multimedia (ACM MM06), pp. 61-64, 2006.
[Smith06] J. Smith and L. Yin, “Efficient hand gesture rendering and recognition using a simple gesture library”, IEEE International Conference on Multimedia and Expo (ICME06), Toronto, Canada, June 2006.
[Sun08] Y. Sun and L. Yin, “Facial Expression Recognition Based on 3D Dynamic Range Model Sequences”, The 10th European Conference on Computer Vision (ECCV08), Marseille, France, October 12-18, 2008.
[Wang06] J. Wang, L. Yin, X. Wei, and Y. Sun, “3D Facial Expression Recognition Based on Primitive Surface Feature Distribution”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2006), New York, NY, June 17-22, 2006.
[Wang07] J. Wang and L. Yin, “Static Topographic Modeling for Facial Expression Recognition and Analysis”, Computer Vision and Image Understanding, Elsevier Science, pp. 19-34, Nov. 2007.
[Wei04] X. Wei, Z. Zhu, L. Yin, and Q. Ji, “Avatar mediated face tracking and lip reading for human computer interaction”, Proceedings of ACM Multimedia 2004 (SIGMM), New York, NY, Oct. 2004.
[WeiZhu04] X. Wei, Z. Zhu, L. Yin, and Q. Ji, “Face Animation by Real Time Feature Tracking”, ACM SIGGRAPH 2004 (Posters Program), Los Angeles, CA, August 2004.