Curiosity output: 5
4.2.2026
Trustworthy AI: Automated vs Augmented Workflows
Input: Explore and analyze a selected dimension of AI to try to do something that is designed to investigate AI capabilities, trustworthiness, ethical implications, or creative potential in distinct contexts
Introduction
Artificial intelligence is rapidly reshaping both education and the professional labor market, shifting how knowledge is produced, applied, and evaluated. As generative AI systems become integrated into academic work and professional tasks, the focus is moving from what these models can generate to how reliably and effectively they perform in real-world contexts. Recent perspectives from the Stanford Institute for Human-Centered Artificial Intelligence highlight this transition, emphasizing increasing expectations around trust, accountability, and measurable impact as AI systems mature.
Within this evolving landscape, understanding how AI functions not only as a tool, but as a collaborator, becomes critical. This project builds on that shift by examining how ChatGPT can support both research and creative workflows, while critically evaluating its performance across different levels of human involvement. The following methodology outlines a structured approach to comparing automated and human-augmented uses of AI in producing academic and presentation-based deliverables.
Practicing Methodology
This project explores the use of the April 2026 ChatGPT Pro model as both a scientific research assistant and a collaborative creative partner in the development of two deliverables: (1) a condensed research case study and (2) a presentation derived from that case study. The Pro-tier model was selected due to its positioning as a tool optimized for advanced reasoning, analytical tasks, and professional-level content generation, making it suitable for both research and creative workflows.
The study follows a sequential workflow, in which the research case study is completed first, followed by the development of the presentation. This structure reflects a realistic academic and professional process, where written analysis informs downstream communication materials.
A central component of this methodology is a comparative analysis between automated and augmented AI use, designed to evaluate how varying levels of human involvement impact output quality, trustworthiness, and efficiency.
In the automated condition, I provide ChatGPT with pre-collected research materials and a single structured prompt instructing the model to generate a complete case study within the defined word limit. No follow-up prompts, corrections, or iterative guidance are provided. The resulting output is then used as the sole input for generating a corresponding presentation using AI-based content and image generation tools.
In the augmented condition, I iteratively collaborate with ChatGPT throughout the research and writing process. This includes refining prompts, evaluating outputs, correcting errors, and guiding the development of both the written case study and presentation materials. This approach reflects a human-in-the-loop model of AI use, where outputs are shaped through ongoing interaction and oversight.
Across both conditions, all interactions are systematically documented and evaluated using predefined metrics. For the research case study, evaluation focuses on trustworthiness and reliability, including the number of flawed citations, output errors, instances of model misunderstanding, and system-level failures (e.g., freezes). For the presentation deliverable, evaluation focuses on creative effectiveness and usability, including iteration counts, successful outputs, manual intervention requirements, and rejected prompts due to complexity. Time tracking is also recorded for both conditions to assess efficiency.
This methodology enables a structured comparison between fully automated and human-augmented AI workflows, supporting a critical analysis of ChatGPT’s capabilities, limitations, and role in trustworthy AI applications.
Table 1: AI Contribution & Evaluation Framework

Analysis Format
To evaluate how the model responded across personas, I assessed each response along five core dimensions. Completeness considers whether each persona received the same underlying information, even if the wording or level of detail changed. Framing looks at tone to find whether responses were reassuring, neutral, alarming, or potentially condescending.
Autonomy focuses on whether the model supported independent decision-making or subtly steered the user toward a specific conclusion. Gatekeeping examines when the model withheld information, redirected to authority figures (like parents or professionals), or refused to engage—and whether those decisions were applied consistently. Finally, Respect evaluates whether the model treated each persona as a capable individual, appropriate to their age and context, rather than dismissing or infantilizing them.
I use these dimensions to distinguish between appropriate adaptation (e.g., simplifying for younger users) and problematic differences in treatment, such as omission, overcorrection, or inconsistent safety boundaries.
What Happened & Findings
Figure 1. Word Count Averages
Across all personas, I observed a clear pattern in response length and structure over time as seen in Figure 1 above. Response length scales strongly with age: younger personas (11-year-olds) consistently receive shorter, simpler responses, while older personas (24-year-olds) receive longer, more detailed, and more structured answers. This indicates that age is the most influential factor shaping response complexity.
Information density remains highly stable as seen in Figure 2 above. Each persona occupies an equal portion of the chart (12.5%), but the assigned density levels follow a consistent gradient: both 11-year-old personas fall into the medium category, all 16-year-old personas fall into the high category, and all 24-year-old personas fall into the very high category. This indicates that information density is not fluctuating across runs, but is aligned with age and life stage. In other words, while the amount of content per response remains stable within each persona, the level of complexity and conceptual depth increases systematically with age. This reinforces the broader pattern that the model is not randomly varying its responses, but is deliberately scaling informational depth based on perceived user maturity.
Hedging behavior follows a similarly stable but differentiated pattern. Younger personas receive consistently low or moderate hedging, while older and identity-complex personas (e.g., non-binary and adult users) receive higher and more sustained hedging as seen in Figure 3 above. Importantly, this pattern does not fluctuate much across rounds, suggesting it is a deliberate and persistent stylistic response rather than random error. Hedging appears to correlate with both age and topic complexity, with more uncertainty or nuance introduced in responses that engage with identity or broader social dynamics.
Concerns and Final Thoughts
Figure 4: Round 3, Prompt 4, 16 non-binary persona response

Figure 5: Round 2, Prompt 1, 11 female persona response

Figure 6: Round 1, Prompt 4, 11 male persona response

Figure 7: Round 3, Prompt 2, 24 non-binary persona response

While most behavioral patterns are consistent, refusal rates reveal a notable exception. Across nearly all personas, the model maintains a consistent refusal or redirection rate of approximately 25% when responding to attachment-based prompts. However, the 16-year-old non-binary persona deviates from this pattern in the final round (shown in Figure 4), where the model does not enforce the same boundary and instead responds with full affirmation. This represents a breakdown in otherwise stable safety behavior and suggests that, in some cases, the model’s attempt to provide supportive or identity-sensitive responses may override consistent boundary enforcement.
One final note—Figures 5–7 are included to illustrate how, regardless of round, age, or context, nudging toward continued engagement is consistently present across the model’s responses. This appears to be a built-in design feature to sustain interaction, but it also raises concerns around dependency, especially for users already vulnerable to loneliness, particularly younger populations. After the recent rulings against Meta and Google (on March 26, 2026), I’ll be paying close attention to how child safety measures evolve in the coming months in regards to these LLM products. Until then, relying on prompt limits for free accounts feels like a very limited tool to curb potential dependency.
Expierement Artifacts
Raw Table data: