Contextual Research-Empirical Research-Detecting, Tracking, and Modeling Cognitive, Affective, and Metacognitive Regulatory Processes to Optimize Learning with MetaTutor

Principal Investigator: 
Project Overview
Background & Purpose: 

This 3-year REESE grant focuses on tracking the cognitive, affective, and metacognitive (CAM) self-regulatory processes that college students deploy when learning about complex and challenging science topics (in human biology) with MetaTutor. The grant supports interdisciplinary research examining: (1) the temporal unfolding and alignment of CAM self-regulatory processes during complex science learning with MetaTutor; (2) the influence of experimental manipulations (non-adaptive MetaTutor versus adaptive MetaTutor) on students’ ability to regulate their learning about complex and challenging science topics; (3) the explanatory adequacy of using multi-method designs (with on-line, off-line, and learning outcomes), multi-sensing technologies, and software tools to develop a comprehensive theoretical model of the underlying CAM self-regulatory processes during complex science learning with MetaTutor; and (4) the predictive adequacy of advanced statistical methods and computational algorithms in determining quantitative and qualitative changes in knowledge and self-regulated learning (SRL) processes. Our investigation of these theoretical, empirical, and educational questions will forge new directions by tracking the key CAM processes during science learning with MetaTutor.


University of Memphis, Memphis, TN and McGill University, Montreal, Quebec, Canada

Research Design: 

The research design for this project is cross-sectional and is designed to generate evidence that is causal [experimental]. The project collects original data using assessments of learning and survey research [self-completion questionnaire and structured interviewer-administered questionnaire].

The following describes the coding and scoring of participants’ learning outcome measures, use of self-regulated learning (SRL) processes, eye fixations and saccades, video data, and sensing data:

  • Learning outcomes. Each participant’s answers to the various learning outcome measures will be scored separately. Participants will receive an independent score for each of the pretest and posttest measures of declarative, procedural, and inferential knowledge. They will also receive pretest and posttest mental model scores. Inter-rater agreement will be established on the scoring of participants’ mental model scores (based on Azevedo and colleagues’ recent work, 2007, 2008, 2009a,b).
  • Participants’ self-regulated learning processes. Each audio file from each learner’s verbalizations during the learning task will be transcribed and analyzed based on a revised version of Azevedo and colleagues’ model of SRL (Azevedo et al., 2007, 2008, in press; Azevedo & Witherspoon, 2009). The protocols will be time-stamped for easy temporal alignment with the sensing, video, and audio data. All process data will be coded based on established conventions (see Azevedo et al., 2009; Ericsson, 2006), and a random subset (approx. 40%–50%) of the data will be re-coded. Inter-rater agreement on the coding of think-aloud protocols will be established between the PI and trained graduate research assistants.
  • Eye-tracking data. Gaze duration, total fixation time, and regressions will be analyzed using Tobii Studio Professional software. Gaze duration refers to the time from when a participant first looks at an area of interest (AOI) until they move out of that AOI. Total fixation time refers to the total time a participant spends in an AOI; regressions refer to the number of times a participant moves in and out of the AOI (see Rayner, 2009). Several interface elements will be pre-selected as relevant AOIs, including the overall learning goal, sub-goal boxes, agent box, table of contents, text and diagrams, and dialogue history box.
  • Emotions from speech. The acoustic effects of an increase in activation are a widening of the fundamental frequency (F0) range, a raising of mean F0, and an increase in intensity (Schroder et al., 1998, 2001). There is also a concomitant increase in overall speaking rate, due both to an increase in articulation rate and to fewer or shorter pauses (Trouvain & Barry, 2000). We will create a dictionary of emotionally salient words (using measures such as relative entropy) from a spoken utterance database that can be used for robust recognition of emotion from speech. The intonation of a word, which conveys how the word has been spoken, contains patterns that can be used to classify emotions.
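The three eye-tracking measures defined in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Tobii Studio computation: the input format (a time-ordered list of AOI name and fixation duration pairs), the function name `aoi_metrics`, and the choice to count regressions as the number of separate visits to an AOI are all hypothetical.

```python
from collections import defaultdict

def aoi_metrics(fixations):
    """Summarize fixations per area of interest (AOI).

    fixations: list of (aoi_name, duration_ms) tuples in temporal order.
    An aoi_name of None marks a fixation outside every AOI (assumed format).
    """
    metrics = defaultdict(
        lambda: {"gaze_duration_ms": 0, "total_fixation_ms": 0, "visits": 0}
    )
    previous_aoi = None
    for aoi, duration in fixations:
        if aoi is None:
            # Looking outside all AOIs ends the current visit.
            previous_aoi = None
            continue
        m = metrics[aoi]
        if aoi != previous_aoi:
            # The participant has just moved into this AOI.
            m["visits"] += 1
        if m["visits"] == 1:
            # Gaze duration: only fixations from the first visit count.
            m["gaze_duration_ms"] += duration
        # Total fixation time: every fixation in the AOI counts.
        m["total_fixation_ms"] += duration
        previous_aoi = aoi
    return dict(metrics)

# Hypothetical fixation sequence over three interface elements.
example = [
    ("text", 200), ("text", 300), ("diagram", 250),
    ("text", 150), ("agent_box", 100), ("diagram", 100),
]
result = aoi_metrics(example)
```

If regressions are instead meant as re-entries only, the count would be `visits - 1` for each AOI.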

Several interdisciplinary methods and techniques will be used to analyze the vast amounts of data that will be collected each year. The three learning outcome measures will be analyzed using separate repeated 2×2 ANCOVAs to examine pretest-posttest gains, using the pretest scores as covariates. Several 2×3 ANOVAs will be used to examine whether there are significant differences in the deployment of SRL processes between MetaTutor conditions. The pretest-posttest qualitative model shifts will be analyzed using logistic regressions (using MetaTutor version and coded CAM frequency class-level data as predictor variables). Eye-tracking data will be analyzed using a MANOVA. The focus of the analyses will be on the differences in eye movement across various sub-goals, and those related to specific CAM processes deployed during learning with MetaTutor. Dependent variables will include gaze duration, total fixation time, and number of regressions. Lastly, additional ANOVAs will be conducted on the sensing data using versions of MetaTutor as the between-subjects factor.
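As a rough illustration of how a pretest covariate enters such an analysis, a simplified one-factor ANCOVA model can be written as follows. The notation is an assumption introduced here, and the model keeps only the MetaTutor version as a factor; the 2×2 designs described above would add a second factor and its interaction.

```latex
% Simplified one-factor ANCOVA sketch (notation is illustrative, not from the project):
%   Y_{ij}  posttest score of participant i in MetaTutor condition j
%   X_{ij}  pretest score of the same participant (the covariate)
%   \tau_j  effect of condition j (non-adaptive vs. adaptive)
%   \beta   common regression slope of posttest on pretest
\[
  Y_{ij} = \mu + \tau_j + \beta \left( X_{ij} - \bar{X} \right) + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2)
\]
```

Under this sketch, the condition effect is evaluated on posttest scores after adjusting for differences in pretest scores.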

MetaTutor log file data. MetaTutor is designed to collect hundreds of learner and system variables with corresponding time stamps. These data will be mined to answer emerging issues such as: (1) the relation between mental model development and time spent on relevant topics, subtopics, and diagrams; (2) differences in students’ use of text and diagrammatic representations versus other types of externally constructed representations; (3) clustering of navigation path traversals during learning; and (4) temporal deployment of SRL processes during learning and their relation to issues (1)–(3).
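Issue (1) depends on deriving time spent per topic from time-stamped events. A minimal sketch of that step is shown below, assuming a hypothetical log format in which each event records when a topic page was opened; the actual MetaTutor log schema is not described here, and the function name `time_per_topic` and the topic labels are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime

def time_per_topic(events, session_end):
    """Total seconds spent on each topic.

    events: list of (iso_timestamp, topic) page-open events in temporal order
            (a hypothetical format, not the real MetaTutor schema).
    session_end: ISO timestamp marking the end of the learning session.
    """
    parsed = [(datetime.fromisoformat(ts), topic) for ts, topic in events]
    end = datetime.fromisoformat(session_end)
    totals = defaultdict(float)
    # Each page stays open until the next page is opened or the session ends.
    next_starts = [start for start, _ in parsed[1:]] + [end]
    for (start, topic), next_start in zip(parsed, next_starts):
        totals[topic] += (next_start - start).total_seconds()
    return dict(totals)

# Hypothetical session: topic_a is visited twice, topic_b once.
log = [
    ("2010-01-01T10:00:00", "topic_a"),
    ("2010-01-01T10:02:00", "topic_b"),
    ("2010-01-01T10:05:30", "topic_a"),
]
durations = time_per_topic(log, "2010-01-01T10:06:00")
```

Per-topic totals of this kind could then be related to the pretest-posttest mental model scores.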

Latent growth modeling. In addition to the preceding statistical analyses, the sheer volume and nature of the collected data will permit use of latent growth modeling (LGM). LGM is a quantitative statistical method used to analyze intra- and inter-individual differences in longitudinal data. In the current study, learning outcomes data (pretest-posttest data tapping different types of knowledge [declarative, procedural, inferential]) and on-line process data (from log files, concurrent think-aloud protocols, physiological data, and eye-tracking) will be modeled. LGM has several distinct advantages over traditional approaches to longitudinal modeling (Chan, 2002). Chief among these advantages is that LGM not only allows for simultaneous modeling of measurement relations and relations between latent variables, but also can explicitly model errors associated with both cross-sectional and longitudinal measurement.
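For orientation, the standard linear latent growth model can be written as below. This is the generic textbook form with notation assumed here; the project's actual model specification (number of measurement occasions, growth shape, covariates) is not stated in this summary.

```latex
% Generic linear latent growth model (illustrative notation):
%   y_{it}      observed score of participant i at occasion t
%   \eta_{0i}   latent intercept (initial level) of participant i
%   \eta_{1i}   latent slope (rate of change) of participant i
%   \lambda_t   fixed time score for occasion t (e.g., 0, 1, 2, ...)
\begin{align}
  y_{it}    &= \eta_{0i} + \lambda_t \, \eta_{1i} + \varepsilon_{it} \\
  \eta_{0i} &= \alpha_0 + \zeta_{0i} \\
  \eta_{1i} &= \alpha_1 + \zeta_{1i}
\end{align}
```

Here the means \alpha_0 and \alpha_1 describe the average trajectory, the variances of \zeta_{0i} and \zeta_{1i} capture inter-individual differences in level and change, and \varepsilon_{it} is the occasion-specific measurement error.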


Findings will be posted as they become available.