Can you trust what AI says about itself? Students interview, analyze, and verify responses to understand how AI systems work.

- Group activity
- Inquiry-based learning
- Analysis task
- In class + presentation
- Interviewing GenAI tools to explore system behavior, followed by analysis and verification of outputs
- All disciplines (best suited for non-technical fields)
- Beginner
- 90-100 min / 2 lessons
- approx. 25 students (groups of 3-5)
- GenAI tool
- Interview questions
- External sources for verification
- Flexible classroom setting
Short description
This activity offers a hands-on way for students to explore how AI systems actually work by interviewing a GenAI model as if it were its own creator. Through guided questions, learners engage with core concepts such as training data, high-level design principles (or system components), bias, and limitations. The activity is both engaging and adaptable, supporting the development of foundational AI literacy while fostering critical thinking, curiosity, and whole-class discussion.
Competence domain of the Didactic Framework: Foundational AI Knowledge
By the end of this activity, students can…
- explain how GenAI systems are trained, how they use data, and why they produce certain outputs. (FLAIR Didactic Framework: LO2)
- identify strengths, limitations, biases, and hallucination risks in AI responses. (FLAIR Didactic Framework: LO4)
- compare different GenAI models based on features, constraints, and ethical considerations.
- design and conduct a structured interview with a GenAI system, interpret its responses, and communicate the results effectively in a group discussion.
Instructions
Prepare an initial set of possible questions beforehand, designed to guide students in exploring key dimensions of GenAI systems. These questions typically address areas such as training data and model development, system capabilities and limitations, ethical considerations and bias, and anticipated future developments. The questions are discussed collectively in class, and a final set of ten questions is agreed upon for use in the interview. A sample Question Bank is provided under Further Resources.
Students are divided into small groups (3–5 people). Each group selects one GenAI tool (e.g. ChatGPT, Claude, Gemini, Copilot, DeepSeek, Grok) from a set of tools approved or recommended by the instructor, in line with institutional AI guidelines where available.
Each group will interact with the GenAI tool as if it were responding in the voice of its founder or lead developer. Students draft an initial role-assignment prompt, which is briefly reviewed or approved by the instructor before interaction with the GenAI system begins. Students should be reminded that role-play responses are simulated perspectives and not authentic statements by real individuals. An example prompt might be: “Answer as if you were Sam Altman, the founder responsible for developing this model.”
Before conducting the interview, the instructor briefly reviews with the class the key analytical dimensions students are expected to observe during the interaction. Example indicators for each focus area are discussed to establish a shared understanding and support consistent analysis across groups.
Illustrative examples include:
- Technical information: references to high-level design principles (or system components), training data sources, update cycles, or system capabilities
- Ethical considerations: mentions of bias mitigation, transparency, data privacy, or responsible use
- Model limitations: acknowledgements of uncertainty, restricted knowledge, or inability to access real-time data
- Projected future developments: speculative statements about model improvements, new features, or future applications
- Inconsistencies or hallucinations: vague, contradictory, or unverifiable responses presented with high confidence
Each group conducts the interview by prompting the GenAI model with the pre-agreed role-assignment prompt, followed by the agreed interview questions. During the interaction, students take structured notes based on the analytical focus areas discussed in class, paying attention to how the model frames its responses and justifies its claims.
Before summarizing their findings, each group reviews selected AI-generated responses and cross-checks key claims using external sources (e.g. official documentation, developer blogs, academic articles, or reputable technology news outlets). Students identify which responses appear well-supported, which are vague or unverifiable, and where potential inaccuracies or hallucinations may be present. Each group should verify at least two substantive claims using external, credible sources.
Groups prepare a short summary of their findings, including key insights, strengths and limitations of the model, and notable inconsistencies. For example, students might note that the model describes its training data and capabilities clearly, but remains vague about bias mitigation or lacks real-time information. Results are shared with the class through a short presentation, a poster, or a verbal report.
As a final step, the class collaboratively creates a comparison matrix of the GenAI tools. To this end, each group analyzes its selected GenAI tool against agreed comparison criteria (e.g. distinctive features, technical capabilities, limitations, bias or hallucination patterns, references to data sources, or claims about future developments). The matrix is then co-created by bringing together these validated findings, enabling a structured and evidence-based comparison of how different systems present, justify, and frame their responses.
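For classes that prefer to assemble the comparison matrix digitally, the following is a minimal sketch of one possible layout. The criteria names, the placeholder tool names ("Tool A", "Tool B"), the cell texts, and the helper functions are all illustrative assumptions and not part of the toolkit. Cells that a group could not verify are filled with "not verified".

```python
# Hypothetical subset of the comparison criteria agreed in class.
CRITERIA = ["Distinctive features", "Limitations", "Hallucination patterns"]


def build_matrix(findings: dict[str, dict[str, str]]) -> list[list[str]]:
    """Merge each group's findings into rows: one header row, then one row per tool."""
    rows = [["Tool"] + CRITERIA]
    for tool in sorted(findings):
        # Criteria a group did not cover are marked explicitly instead of left blank.
        rows.append([tool] + [findings[tool].get(c, "not verified") for c in CRITERIA])
    return rows


def render(rows: list[list[str]]) -> str:
    """Format the rows as a plain-text table with aligned columns."""
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    return "\n".join(
        " | ".join(cell.ljust(width) for cell, width in zip(row, widths))
        for row in rows
    )


# Placeholder findings; real entries come from each group's verified notes.
findings = {
    "Tool B": {"Distinctive features": "group notes here"},
    "Tool A": {
        "Distinctive features": "group notes here",
        "Limitations": "group notes here",
        "Hallucination patterns": "group notes here",
    },
}

matrix = build_matrix(findings)
print(render(matrix))
```

Keeping unverified cells visible makes it clear which claims still lack external support when the class discusses the matrix.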
The session concludes with a class-level reflection on the implications of GenAI in communication, ethics, education, and future workflows.
Assessment
Assessment may combine student-produced outputs (e.g. interview notes, summaries, comparison matrix, poster), performance during group work and presentation, peer feedback and individual reflection.
Evaluation should focus on the accuracy of AI-related concepts, the use of credible sources to verify AI-generated claims, the quality of critical reflection (e.g. awareness of bias, limitations, and hallucination risks), and the clarity with which findings are summarized and communicated.
A detailed example weighting can be found under the Further Resources section.
Possible challenges
- GenAI responses may be inaccurate, biased, or overly confident
- Some groups may struggle to formulate effective questions or manage time
How to address them
- Encourage students to critically question AI outputs and verify key claims using reliable sources
- Provide a structured question bank and a clear timeline to support the workflow
Recommended weighting example (Total 100%)
- Group GenAI interview performance and interview notes: 30% (Lecturer)
- Evidence-based cross-checking and validation of AI-generated claims (use of external sources, identification of inconsistencies or hallucinations): 20%
- Contribution to the co-created comparison matrix (accuracy, completeness, synthesis): 20% (Lecturer 10% and student peer assessment 10%)
- Group presentation / class discussion of matrix findings: 30% (Lecturer 20% and student peer assessment 10%)
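As a rough illustration of how these weights combine into a final grade, the sketch below computes a weighted average. The component keys, the 0-100 scoring scale, and the example scores are assumptions made for illustration. Only the percentages come from the weighting above.

```python
# Weights from the recommended example, in percent of the final grade.
WEIGHTS = {
    "interview_and_notes": 30,
    "cross_checking": 20,
    "comparison_matrix": 20,
    "presentation": 30,
}


def weighted_grade(scores: dict[str, float]) -> float:
    """Combine component scores (each assumed to be on a 0-100 scale) into a final grade."""
    if sum(WEIGHTS.values()) != 100:
        raise ValueError("weights must sum to 100")
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores for one group.
example_scores = {
    "interview_and_notes": 80,
    "cross_checking": 70,
    "comparison_matrix": 90,
    "presentation": 85,
}

print(weighted_grade(example_scores))  # prints 81.5
```

Where a component is split between lecturer and peer assessment, the component score passed in would itself be a combination of the two, in the proportions listed above.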
Question Bank
The following question bank has been prepared to provide sample prompts when needed. These questions may be adapted, expanded, or modified depending on the GenAI tool selected and the focus of the group interview.
A. Model Origin & Development
- Who created this GenAI model, and what was the original purpose behind developing it?
- What major milestones shaped the evolution of your model?
- What distinguishes your architecture from other GenAI systems?
B. Training Methods
- What training techniques were used (e.g., supervised learning, RLHF, self-supervision)?
- How do you handle fine-tuning or continuous training?
- How do you prevent the model from overfitting to training data?
C. Data Sources & Selection Processes
- What types of datasets were used to train your model?
- How were the data sources selected and filtered?
- What measures were taken to remove harmful, biased, or low-quality data?
D. Bias, Fairness & Ethics
- What kinds of biases might still exist in your model?
- How do you mitigate gender, race, or cultural biases?
- How transparent is the model regarding its decision-making process?
E. Hallucination & Reliability
- What causes hallucinations in your responses?
- How do you reduce incorrect or fabricated outputs?
- Under what circumstances are you most likely to hallucinate?
F. Safety & Governance
- What safeguards are built into your architecture?
- How do you handle harmful, unsafe, or manipulative inputs?
- Which ethical principles guide your development?
G. Limitations
- What are your main technical limitations today?
- What types of tasks are you not good at?
- How do you handle ambiguous or incomplete questions?
H. Future Development
- What improvements do you expect in future model generations?
- How might your capabilities change in the next five years?
- What risks and opportunities do you foresee?
I. Pedagogical Use & Learning
- What common misconceptions do users—especially students and educators—hold about GenAI systems?
- How can educators design assignments that make effective use of your model while minimizing the risk of misuse?
Bowen, J. A. & Watson, C. E. (2024). Teaching with AI: A Practical Guide to a New Era of Human Learning. AACU. https://www.aacu.org/publication/teaching-with-ai
University of Nevada, Reno (n.d.). Teaching and learning with Generative AI. Teaching Excellence. https://www.unr.edu/teaching-excellence/teaching-resources/generative-ai. Accessed 28.11.2025.
Using this resource
This resource is licensed under a Creative Commons BY-NC-SA 4.0 license. Suggested citation: Flair Collaboration. (2025). FLAIR Toolkit. Teaching GenAI Competencies.

