Beyond Performance

Child seated by a table in kindergarten, wearing wristband with physiological sensors

Overview

In this project, we built a framework for understanding cognitive-affective states in kindergarteners and presented applications of the framework in classrooms.

Cognitive–affective states during learning are related to the mental effort of the learner or the cognitive load imposed on the learner. By triangulating data from observations, physiological markers, self-reports and performance while children performed tasks of varying mental effort, we can gain deeper insights into these states than performance scores alone provide.

Role

I assisted the project’s lead, Priyashri Sridhar, in:

  • Data analysis of video recordings taken during the cognitive tasks, using Microsoft Emotion API, a cloud-based machine learning service for estimating emotion from face images
  • Creating graphs and visualisations to present performance scores, physiological data, emotion estimation data and overall findings
  • Writing and editing the conference paper and journal article for the project

Process

Behavioural and Emotional Data Analysis

Background

We wanted to explain the framework through a combination of observational, physiological and performance data from two case studies (two children).

Video recordings were taken while the children were performing cognitive tasks of increasing mental effort to provide us with the observational data. These recordings were manually coded by two researchers for behaviours and facial expressions (emotions).

However, such manual methods would be difficult and time-consuming to apply to a larger group of children, such as a classroom. Thus, we aimed to explore the possibility of deriving these emotions computationally and integrating them with the rest of the data from the physiological sensors.

Analysing Emotion from Video

After reviewing existing tools for estimating emotion from video, I selected Microsoft Emotion API (now part of Microsoft Face API) for its accuracy and ease of use. It could analyse emotions from photos and videos and return confidence scores (from 0 to 1) for the presence of eight emotions: Anger, Contempt, Disgust, Fear, Happiness, Neutral, Sadness and Surprise.
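As a rough illustration of what such per-frame scores look like and how a single "main" emotion could be read off them, here is a minimal Python sketch. The dictionary shape, the function name `dominant_emotion` and the example values are assumptions made for illustration; they are not the Emotion API's actual response format or real data from the study.

```python
# The eight emotion labels named in the text.
EMOTIONS = ["Anger", "Contempt", "Disgust", "Fear",
            "Happiness", "Neutral", "Sadness", "Surprise"]


def dominant_emotion(scores):
    """Return the label with the highest confidence score.

    `scores` is a hypothetical dict mapping each emotion label
    to a confidence value between 0 and 1.
    """
    missing = [name for name in EMOTIONS if name not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    return max(EMOTIONS, key=lambda name: scores[name])


# Illustrative values only, not taken from the case studies.
example_frame = {
    "Anger": 0.01, "Contempt": 0.02, "Disgust": 0.00, "Fear": 0.03,
    "Happiness": 0.10, "Neutral": 0.22, "Sadness": 0.00, "Surprise": 0.62,
}
print(dominant_emotion(example_frame))
```

Taking the maximum is the simplest possible rule; a threshold on the winning score could be added if low-confidence frames should be treated as undecided.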

I built the application from the Windows SDK for the Emotion API using Visual Studio 2015 (C#). I modified it to log the confidence scores for each emotion over time, from the two case study video recordings, into .json files.

The logged data from the .json files were formatted into tables in Microsoft Excel. By checking the files and the API documentation (now deprecated), I found that the Emotion service analysed video frames at an interval of 15000 Ticks. A Tick was a unit of time defined by the API. Ticks were converted into seconds by dividing by the Timescale (30000 Ticks per second): Seconds = Ticks/30000. An interval of 15000 Ticks therefore corresponds to 0.5 seconds, or two analysed frames per second. Timestamps were calculated with the formula Timestamp = Seconds/86400, which gives the fraction of a day that Excel uses to store times, and the values were displayed with the ‘mm:ss’ custom format. This made it easier to sync the timestamps of the emotion confidence scores with the video and sensor data for further analysis. Figure 1 shows the resulting data in Excel and Figure 2 shows an example of the emotions over time plots for three cognitive tasks.
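The same conversion can be sketched outside Excel. The Python below is a minimal illustration, not the original workflow: the constants come from the text, while the function names, the row layout (mirroring the 11 columns of the Excel table) and the choice to truncate to whole seconds in the mm:ss label are assumptions.

```python
EMOTIONS = ["Anger", "Contempt", "Disgust", "Fear",
            "Happiness", "Neutral", "Sadness", "Surprise"]

TIMESCALE = 30000             # Ticks per second, as stated in the text
FRAME_INTERVAL_TICKS = 15000  # Ticks between analysed frames, as stated in the text
SECONDS_PER_DAY = 86400


def ticks_to_seconds(ticks, timescale=TIMESCALE):
    """Seconds = Ticks / Timescale."""
    return ticks / timescale


def seconds_to_day_fraction(seconds):
    """Timestamp = Seconds / 86400, the day fraction a spreadsheet stores."""
    return seconds / SECONDS_PER_DAY


def format_mm_ss(seconds):
    """Format seconds as mm:ss, truncating to whole seconds (an assumption)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def build_row(frame_index, scores):
    """Build one 11-column row: Ticks, Seconds, Timestamp, then eight scores.

    `scores` is a hypothetical dict of emotion label -> confidence score.
    """
    ticks = frame_index * FRAME_INTERVAL_TICKS
    seconds = ticks_to_seconds(ticks)
    return [ticks, seconds, format_mm_ss(seconds)] + [scores[e] for e in EMOTIONS]


# Frame 180 is 2,700,000 Ticks, i.e. 90 seconds into the video.
placeholder_scores = {name: 0.0 for name in EMOTIONS}
print(build_row(180, placeholder_scores)[:3])
```

Keeping the numeric seconds alongside the formatted label makes it straightforward to join the emotion scores with sensor data sampled on a different clock.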

The table has 11 columns. Column 1 shows the Ticks, starting from 0 and increasing in increments of 15000 for each row. Column 2 shows the Seconds calculated from Ticks, also starting from 0. Column 3 shows the timestamp. Columns 4 to 11 show confidence scores, in the range 0 to 1, for the emotions Anger, Contempt, Disgust, Fear, Happiness, Neutral, Sadness and Surprise.
Figure 1: Resulting data from emotion analysis of video recording
Three emotions over time plots, from left to right. Left: the Number Series cognitive task, about 3 minutes long. Surprise was the dominant emotion for the first 2.5 minutes, followed by about 20 seconds of happiness that peaked at a score of 0.6 and 10 seconds of slight sadness that peaked at a score of 0.25. Middle: the DCCS mixed cognitive task, about 4 minutes long. Surprise peaks at the start, at 11 seconds and 28 seconds. Happiness peaks at 30 seconds (score 0.65). There is a mix of anger and contempt at a score of 0.25 each. Most of the rest of the video shows happiness, with peaks above 0.7 from 1.5 minutes to 2 minutes and more peaks from 3.5 minutes to 4 minutes. Right: the Verbal Attention task, about 4 minutes long. A neutral expression was mainly detected. Small surprise peaks appear from 12 seconds to 1 minute, with a highest score of 0.4. From 2.5 minutes to the end, sadness peaks appear, with a highest score of 0.5.
Figure 2: Emotions over time plots. X-axis shows the confidence score out of 1 for the different emotions and Y-axis shows the corresponding timestamp.

Visualisations for Combined Analysis

To explain the framework through the combined analysis of observational, physiological and performance data, two visualisations were made (Figures 3 and 4), one for each case study.

The intention was to show the observational (emotion and facial expressions), physiological (galvanic skin response and frequency-domain heart-rate variability) and performance data over time. (Apologies for the low accessibility of the visualisations; I wrote alternative text for each figure to help explain them further.)

The key annotations are explained from left to right of the visualisation. For the first 47 seconds and 11 trials of the cognitive tasks, the participant responded correctly. Her GSR measures were low, which meant that she was relaxed. There were two Surprise peaks at 14 seconds and 40 seconds with confidence scores above 0.5. Happiness was also detected at 45 seconds. LF/HF ratios were at 2.0 towards the start, which meant she required quite high mental effort. The ratios decreased to 0.7 towards the end of trial 11. From 48 seconds to 1 minute 10 seconds, during trial 12, she gave an incorrect response. We start to see an increase in the GSR values, an increase in LF/HF ratios and an increase in the Sadness confidence scores. Between 1 minute 11 seconds and 1 minute 16 seconds, she answered trial 13 correctly and we noticed a high GSR peak. This could mean that she was excited to have answered correctly. From 1 minute 17 seconds to the end at 2 minutes 35 seconds, she answered the remaining trials incorrectly. We observe a gradual decrease in the GSR values and an increase in Sadness scores. The LF/HF ratio was 1.2 but eventually decreased to 0.3 at the end, meaning that she was putting in mental effort to answer but eventually gave up.
Figure 3: Visualisation for Case Study 1
For the first 2.5 minutes and trials 1 to 14, the participant answered the tasks correctly. There were high Surprise peaks. GSR values peaked at trial 8, at about 59 seconds to 1 minute 13 seconds. His LF/HF ratio was 1.2 near the start and decreased to 0.5 at about 1 minute 52 seconds, meaning that he gradually used less mental effort as the trials progressed. From 2.5 minutes to the end at 3 minutes 13 seconds, he answered the rest of the trials incorrectly. He expressed happiness from 2.5 minutes to 2 minutes 41 seconds, which slowly changed to sadness and surprise. GSR peaked in the last 10 seconds and the LF/HF ratio increased to 1.7, indicating high mental effort and possibly stress.
Figure 4: Visualisation for Case Study 2

In both figures, the coloured lines represent the confidence scores for emotions: the orange line represents surprise, the green line happiness and the blue line sadness. At the various confidence score peaks, we annotated:

  • snapshots of the facial expressions
  • the main inferred emotion
  • heart-rate variability (HRV) values of low-frequency (LF) power, high-frequency (HF) power and the LF/HF ratio: the higher the ratio, the higher the mental effort
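The LF/HF ratio in the last annotation is a plain division of the two band powers. The Python sketch below only illustrates that arithmetic; the function name and the input values are hypothetical, and how the LF and HF powers were extracted from the sensor signal is not covered here.

```python
def lf_hf_ratio(lf_power, hf_power):
    """Return the LF/HF ratio from low- and high-frequency HRV band power.

    Both inputs must be in the same unit. Per the text, a higher ratio
    is read as higher mental effort.
    """
    if hf_power <= 0:
        raise ValueError("hf_power must be positive")
    return lf_power / hf_power


# Illustrative band powers only, not measurements from the case studies.
early = lf_hf_ratio(600.0, 300.0)   # ratio of 2.0
late = lf_hf_ratio(350.0, 500.0)    # ratio of 0.7
print(early, late, early > late)
```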

Galvanic skin responses (GSR) are shown as black lines and measured in microSiemens. Usually, a higher number of peaks and greater peak values indicate higher arousal or excitement. Performance trials of the cognitive tasks are numbered 1 through 19 or 20 at the top of the graph. The higher the trial number, the higher the mental effort needed for the child to perform the task. The green and red shaded regions represent correct and incorrect responses respectively.
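To make "number of peaks" and "peak values" concrete, here is a minimal Python sketch that finds local maxima in a GSR trace. It is a simple illustration under stated assumptions, not the method used in the project: the function name, the `min_height` threshold and the sample values are all hypothetical, and a real analysis would typically smooth the signal first.

```python
def find_gsr_peaks(samples, min_height=0.0):
    """Return (index, value) pairs for local maxima at or above min_height.

    `samples` is a list of GSR readings in microSiemens at a fixed
    sampling rate. A flat-topped peak is reported once, at its first sample.
    """
    peaks = []
    for i in range(1, len(samples) - 1):
        rises = samples[i] > samples[i - 1]
        does_not_rise_after = samples[i] >= samples[i + 1]
        if rises and does_not_rise_after and samples[i] >= min_height:
            peaks.append((i, samples[i]))
    return peaks


# Illustrative readings only, not data from the case studies.
trace = [0.1, 0.3, 0.2, 0.5, 0.9, 0.4]
peaks = find_gsr_peaks(trace, min_height=0.25)
print(len(peaks), max(value for _, value in peaks))
```

The peak count and the largest peak value are the two quantities the text associates with arousal; comparing them between the correct and incorrect regions of a recording is one way to use this output.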

Results

The framework can help researchers and educators gain a deeper understanding of learners’ cognitive-affective states through triangulation, that is, the combined analysis of observational, physiological and performance data.

Details of the framework, its application and our analysis of the case studies have been published in the International Journal of Child-Computer Interaction (IJCCI 2019) and at IDC 2018 (refer to papers below). I highly encourage you to read the section titled “Putting Them All Together: Triangulating Physiological, Performance and Behavioural Measures”, on pages 44 to 46 of the IJCCI article and pages 259 to 261 of the IDC paper.