Constructing Evaluation with GenAI: How Knowledge Sources Shape Rubrics for Technology-Integrated Instruction

Submission 219

Presented by: Chen Britton

Chen Britton, Guy Cohen, Anat Cohen

School of Education, Tel-Aviv University, Israel

As generative artificial intelligence (GenAI) becomes embedded in educational evaluation, a key question arises: how are evaluative criteria for technology-integrated instruction constructed? This question is particularly relevant when evaluating pre-service teachers' lesson plans, where assessment is shaped not only by the criteria used but also by how those criteria are constructed and represented, which in turn influences what counts as evidence of instructional quality. This qualitative study compares four rubric configurations: Expert-Based, Pretrained-Based, Practice-Based, and Theory-Practice-Based, each generated using a controlled prompt structure differing only in the knowledge sources provided. Findings reveal that although similar pedagogical dimensions appeared across all configurations, the configurations differed substantially in how those dimensions were articulated and emphasized. These differences are captured through a CORE approach examining variation in evaluative Content, Organization, Representation, and Emphasis. Specifically, Pretrained-Based configurations produced broad and abstract criteria with limited contextual grounding, whereas Practice-Based configurations emphasized concrete criteria focused on observable classroom actions and instructional enactment. Theory-Practice-Based configurations connected pedagogical principles with classroom implementation, while Expert-Based configurations drew clear distinctions between instructional quality and technology use. Taken together, these configurations reflected distinct evaluative orientations toward technology-integrated instruction. The study shows that GenAI does not simply apply evaluative criteria; it actively participates in shaping how those criteria are constructed, while human decisions about which knowledge sources to provide shape what becomes visible and pedagogically valued within assessment.
These findings point to the importance of critically examining AI-generated rubrics and ensuring active human involvement in the design of pedagogically meaningful assessment practices.