The project will generate descriptive evidence [ethnography, secondary analysis of archival data]. Original data are being collected on AI-AN STEM professionals using survey research [self-completion questionnaire, semi-structured or informal interviews], and the secondary data include 35 years of organizational archival data from AISES. Qualitative data will be analyzed inductively using modified grounded theory-based coding.
The analysis of AISES archived data will be conducted in three phases. Phase I will extract data from the AISES member demographic data and the scholarship applicant demographic data. From the AISES member data, we will identify the demographic variables that are correlated with the successful pursuit of a STEM degree and identify time trends related to age and gender. To analyze the factors related to STEM success, we will estimate a logistic regression with STEM degree (1 = yes, 0 = otherwise) as the dependent variable and demographic and geographic variables as the explanatory variables.
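As an illustration only, the Phase I logistic regression could be sketched as follows. The data are synthetic, and the variable names (`age`, `female`, `stem_degree`) and the Newton-Raphson fitting routine are assumptions made for this sketch; they do not describe the actual AISES variables or the software that will be used.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, tol=1e-8):
    """Fit a logistic regression by Newton-Raphson.
    X must already contain an intercept column; y is coded 1/0."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # predicted probabilities
        grad = X.T @ (y - p)                  # score vector
        w = p * (1.0 - p)                     # observation weights
        hess = X.T @ (X * w[:, None])         # observed information matrix
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Synthetic stand-in for member records (hypothetical variables, not AISES data).
rng = np.random.default_rng(0)
n = 2000
age = rng.normal(0.0, 1.0, n)                 # standardized age
female = rng.integers(0, 2, n).astype(float)  # 1 = female, 0 = otherwise
X = np.column_stack([np.ones(n), age, female])

true_beta = np.array([-0.5, 0.8, 0.4])        # assumed values for the simulation
p_true = 1.0 / (1.0 + np.exp(-X @ true_beta))
stem_degree = (rng.random(n) < p_true).astype(float)  # 1 = yes, 0 = otherwise

beta_hat = fit_logistic(X, stem_degree)
odds_ratios = np.exp(beta_hat)                # multiplicative effect on the odds
print("coefficients:", beta_hat)
print("odds ratios: ", odds_ratios)
```

Exponentiated coefficients are read as odds ratios: a value above 1 for a demographic variable indicates higher odds of holding a STEM degree, holding the other variables fixed.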
In Phase II, we will incorporate the constructs from the initial ethnographic interviews, the resume data, and the science fair participant demographic data and project abstracts to strengthen the Phase I analysis with factors that explain the choice of a STEM degree and the number of years required to graduate. Because a person can choose among multiple STEM degree paths, a multinomial logistic regression model will be developed to assess the influence of the independent variables on the choice of one STEM degree versus the others. To assess time to graduation, a survival analysis will be conducted using: (1) a Cox analysis, (2) various mean functions (exponential, log-normal, Weibull), with a likelihood ratio used to determine best fit, and (3) a mixture model that incorporates the logistic model from Phase I and a conditional survival analysis using the “best-fit” hazard rate function from (2).
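The likelihood ratio comparison in step (2) can be illustrated with a minimal sketch, under stated assumptions. It compares only the exponential and Weibull models, which are nested (the exponential is a Weibull with shape 1); the log-normal is not nested in this pair and is left out here. The durations are simulated, censoring at six years is an arbitrary choice, and all function and variable names are hypothetical.

```python
import numpy as np

def weibull_profile_loglik(k, t, event):
    """Weibull log-likelihood at shape k, with the scale set to its MLE.
    t: observed durations; event: True if graduation was observed,
    False if the record is right-censored."""
    d = event.sum()
    theta = np.sum(t ** k) / d                # MLE of scale**k given k
    return (d * np.log(k) - d * np.log(theta)
            + (k - 1.0) * np.sum(np.log(t[event])) - d)

def fit_weibull_shape(t, event, lo=0.05, hi=20.0, n_iter=100):
    """Locate the shape MLE by bisection on the profile score,
    which is decreasing in k."""
    d = event.sum()
    log_t = np.log(t)

    def score(k):
        tk = t ** k
        return d / k - d * np.sum(tk * log_t) / np.sum(tk) + np.sum(log_t[event])

    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Simulated years-to-graduation (not AISES data), censored at 6 years.
rng = np.random.default_rng(1)
n = 1000
latent = 5.0 * rng.weibull(1.5, n)            # assumed shape 1.5, scale 5
event = latent <= 6.0                         # True = graduation observed
t = np.minimum(latent, 6.0)

k_hat = fit_weibull_shape(t, event)
ll_weibull = weibull_profile_loglik(k_hat, t, event)
ll_exponential = weibull_profile_loglik(1.0, t, event)  # shape fixed at 1

lr_stat = 2.0 * (ll_weibull - ll_exponential)
print("Weibull shape estimate:", k_hat)
print("LR statistic:", lr_stat, "(5% critical value, 1 df: 3.841)")
```

A likelihood ratio statistic above 3.841 would favor the Weibull over the exponential at the 5% level, meaning the hazard of graduating is not constant over time.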
In Phase III, we will include the structural constructs from the ethnographic interviews and the first survey. The models will draw on the data sets above and any other AISES data set that can be used to capture structural variables that are correlated with, or can proxy for, the constructs from the qualitative data. These will be multilevel models (logistic and duration), with the additional sources of variation defined by the interviews.