ChatGPT overreliance is easy to miss: as chatbots draft emails, analyses, and code, humans stop interrogating outputs and start outsourcing thought. The Washington Post warns that "automation bias" nudges users to accept AI answers without checking, quietly weakening judgment over time [4]. NIST's July 26, 2024 generative-AI profile calls these systems "high-impact," urging human oversight and testing to catch performance drift and misuse [1]. And the White House's May 16, 2024 worker guidance stresses transparency, upskilling, and keeping decision authority with people rather than machines [3].
Key Takeaways
– Clinicians' unassisted adenoma detection fell from 28% to 22% after routine AI exposure, a 6-percentage-point drop and roughly a 21% relative decline [5].
– In January 2024, the FTC issued Section 6(b) orders with 45-day response windows to probe generative-AI investments, partnerships, and input-market concentration [2].
– NIST's July 26, 2024 generative-AI profile classifies generative systems as high-impact and urges lifecycle testing, transparency, and oversight to monitor for performance drift and misuse [1].
– The White House's May 16, 2024 worker guidance centers empowerment, requires human oversight and transparency, and prioritizes upskilling to keep decision authority with people [3].
– June 3, 2025 reporting on automation bias recommends "distrust-and-verify" habits and design nudges that force engagement before users accept AI outputs [4].
What “ChatGPT overreliance” Looks Like at Work
In practice, ChatGPT overreliance starts with convenience. A chatbot drafts the memo faster, the code assistant proposes a fix, and the meeting summary auto‑populates. The problem is not the assist—it’s the surrender of scrutiny and situational awareness. The Washington Post documents how automation bias leads users to trust screens over senses, especially under time pressure and workload, and recommends a “distrust‑and‑verify” routine to break the habit [4].
Those verification steps can be simple: ask for sources, test the claim on a small sample, or reframe the question and compare outputs. The stakes rise with domain criticality. When AI answers are taken at face value in medicine, finance, or HR, errors propagate into real‑world harm. The Post emphasizes design that compels engagement—forcing users to review evidence or explain acceptance decisions—to counter the default of blind reliance [4].
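To make the reframe-and-compare step concrete, here is a minimal sketch in Python; `ask_model` is a hypothetical stand-in for whatever chat interface or API a team actually uses, and the string comparison is deliberately crude.

```python
# Minimal "distrust-and-verify" sketch. `ask_model` is a hypothetical stand-in
# for a team's chat interface or API; it is not a real library call.
from typing import Callable, Dict


def verify_by_reframing(question: str, ask_model: Callable[[str], str]) -> Dict[str, object]:
    """Ask the same question two ways and flag disagreement for human review."""
    original = ask_model(question)
    reframed = ask_model(f"Answer step by step and cite your sources: {question}")
    # Deliberately crude agreement check; real checks might compare extracted
    # claims, numbers, or citations rather than raw text.
    agrees = original.strip().lower() == reframed.strip().lower()
    return {
        "original": original,
        "reframed": reframed,
        "needs_human_review": not agrees,  # disagreement => verify before accepting
    }
```

In practice the comparison would look at extracted claims, figures, or citations rather than raw text, but even a crude check turns "accept by default" into "verify by default."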
The Clinical Warning: A 28% to 22% Drop After AI Exposure
Evidence of “deskilling” is emerging. In mid‑2025 coverage, Time reported on a Lancet Gastroenterology study subset (from the ACCEPT trial) in which clinicians’ unassisted adenoma detection rate fell from roughly 28% to about 22% after they had become accustomed to AI assistance in colonoscopy [5]. That is a 6‑percentage‑point decline—about a 21% relative drop—when clinicians subsequently worked without AI. The finding signals a measurable erosion of skill associated with routine AI exposure [5].
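Stated as arithmetic, using the figures reported in [5]:

$$\text{relative decline} = \frac{28\% - 22\%}{28\%} = \frac{6}{28} \approx 21\%$$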
Context matters. The study does not condemn AI in endoscopy overall; assistive systems can lift average detection rates when used properly. But the unassisted decline indicates a risk: if people stop practicing core competencies, their baseline degrades. That dynamic generalizes beyond medicine. In any high‑stakes domain, a model that “usually” helps can still quietly reduce the human’s ability to detect edge cases or model failures, precisely when independent judgment is needed [5].
The Policy Sweep: NIST, White House, FTC Move to Reinforce Humans
Standards bodies and regulators are moving. On July 26, 2024, NIST released a generative‑AI profile that augments its AI Risk Management Framework, labeling generative systems as high‑impact, requiring governance, explainability, and monitoring for performance drift, adversarial manipulation, and misuse [1]. The profile’s through‑line is explicit: maintain human control with oversight at every lifecycle stage and test not just for accuracy but for overreliance risks [1].
Worker policy aligns. The White House’s May 16, 2024 fact sheet urges “human‑in‑the‑loop” design, transparency about AI use at work, and upskilling so employees can evaluate, challenge, and override AI suggestions. It centers worker empowerment and labor protections, positioning AI as an assistive tool—not an authority—in the workplace [3]. That framing directly targets the human factors behind ChatGPT overreliance, prioritizing decision rights and training that preserve critical thinking [3].
Competition and governance matter too. In January 2024, the FTC issued Section 6(b) orders to Alphabet, Amazon, Anthropic, Microsoft, and OpenAI, giving the companies 45 days to respond regarding their generative-AI investments and partnerships. The agency's focus on inputs, partnerships, and competitive impacts signals scrutiny of AI supply chains and market power, pressures that can shape how aggressively AI is pushed into workflows and how much latitude users have to question it [2].
Designing Out Overreliance: Metrics and Playbooks
The design problem is measurable. Organizations should track “acceptance without verification” rates: the share of AI suggestions accepted without evidence review or testing. NIST’s profile emphasizes ongoing testing and transparency; extending that to user behavior means instrumenting interfaces to prompt verification and logging whether users review sources, run checks, or escalate to a human [1]. The goal is to make the “distrust‑and‑verify” step a default, not an exception [4].
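One way to instrument this, sketched in Python below, is to log every AI-suggestion review as a structured event; the field names and in-memory list are assumptions for illustration, not a standard schema or anything prescribed by NIST.

```python
# Sketch of instrumenting AI-suggestion reviews, assuming a simple in-process
# event log. Field names are illustrative, not a standard or NIST-defined schema.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class SuggestionEvent:
    user_id: str
    suggestion_id: str
    accepted: bool   # did the user accept the AI output?
    verified: bool   # did they review sources or run a check first?
    overruled: bool  # did a human reviewer later overturn it?
    correct: bool    # ground truth, back-filled once known
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


event_log: List[SuggestionEvent] = []


def record_event(event: SuggestionEvent) -> None:
    """Append a review event; a production system would persist and aggregate these."""
    event_log.append(event)
```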
Key metrics can include:
– False acceptance rate: proportion of incorrect AI outputs accepted without challenge.
– Second‑look rate: percent of AI outputs subjected to an independent check.
– Overrule ratio: share of AI suggestions overturned by human reviewers.
– Drift alerts: frequency of performance drift events flagged in production.
Tie these metrics to thresholds that trigger workflow changes: require second reviews above certain risk levels, escalate to domain experts, or pause deployments pending retraining (see the sketch below). NIST's guidance on explainability and governance supports such guardrails; explainable outputs, provenance metadata, and adversarial testing reduce the credibility halo that fuels overreliance in the first place [1].
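Building on the `SuggestionEvent` records from the earlier sketch, the following illustrates how the listed metrics could be computed and tied to thresholds; the cutoff values are placeholders, not recommendations from NIST or any cited source.

```python
# Sketch: compute the metrics above from logged SuggestionEvent records and map
# them to illustrative guardrail thresholds. Cutoff values are placeholders.
def overreliance_metrics(events):
    accepted = [e for e in events if e.accepted]
    blind_wrong = [e for e in accepted if not e.correct and not e.verified]
    total = max(len(events), 1)
    return {
        # False acceptance rate: incorrect outputs accepted without challenge
        "false_acceptance_rate": len(blind_wrong) / max(len(accepted), 1),
        # Second-look rate: outputs subjected to an independent check
        "second_look_rate": sum(e.verified for e in events) / total,
        # Overrule ratio: suggestions overturned by human reviewers
        "overrule_ratio": sum(e.overruled for e in events) / total,
        # Drift alerts would come from model monitoring pipelines, not this log.
    }


def apply_guardrails(metrics, *, max_false_acceptance=0.05, min_second_look=0.30):
    """Return workflow actions when metrics cross the illustrative thresholds."""
    actions = []
    if metrics["false_acceptance_rate"] > max_false_acceptance:
        actions.append("require a second review on AI-assisted decisions")
    if metrics["second_look_rate"] < min_second_look:
        actions.append("prompt verification in the UI before acceptance")
    return actions
```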
Policies to Counter ChatGPT Overreliance
Policy can harden these practices. First, mandate disclosure: users should be told when AI aids a decision, what model is in play, and what known failure modes exist. That mirrors worker transparency principles outlined by the White House and keeps consent informed rather than implicit [3]. Second, require human decision authority in safety‑critical contexts and keep auditable logs of overrides and acceptance rationales, in line with NIST’s emphasis on oversight [1].
Third, protect unassisted practice: ensure people regularly perform tasks without AI to maintain baseline competence. The clinical evidence that unassisted detection fell from 28% to 22% after routine AI exposure makes the case for periodic "AI-off" drills and skill checks to prevent decay [5]. Fourth, align incentives. If speed bonuses reward blind acceptance of AI outputs, overreliance is rational. Instead, tie performance rewards to accuracy, calibration, and appropriate skepticism, echoing the "distrust-and-verify" ethos highlighted in June 2025 reporting [4].
Finally, governance should consider market structure. The FTC’s 45‑day 6(b) orders reflect a view that concentration in AI inputs and partnerships can reshape power over deployment norms. Procurement and compliance teams should evaluate whether vendor designs actively mitigate overreliance—forced evidence views, built‑in verification workflows—or simply maximize throughput at the expense of judgment [2].
Risk Scenarios: Where Overreliance Hurts Fast
– Clinical decision support: A model’s confident but wrong suggestion nudges a missed diagnosis; without enforced verification, downstream care compounds the error [5].
– Hiring and promotion: A résumé screener’s patterns encode bias; if managers rubber‑stamp rankings, transparency and appeal rights are moot [3].
– Finance and audit: Generative summaries gloss over anomalies; absent second‑look requirements, material misstatements slip through controls [1].
– Security operations: Over‑trust of automated triage can mute alerts; adversaries exploit predictable acceptance thresholds [1].
– Consumer services: Chatbots misstate policies; if agents accept suggestions uncritically, refunds, compliance, and customer trust are jeopardized [4].
Conclusion: Don’t Let the Tool Do the Thinking
The lesson is not to reject AI—it’s to refuse autopilot. The evidence of a 28% to 22% drop in unassisted detection after routine AI exposure is a quantifiable warning about skill decay [5]. Standards and policies from NIST and the White House emphasize human oversight, transparency, and training to keep people in charge [1][3]. And the FTC’s inquiry underscores that governance and market incentives shape how these tools show up in our work [2]. Use AI to amplify your judgment; don’t let it replace it.
Sources:
[1] NIST – Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile: https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=958388
[2] Federal Trade Commission – FTC Launches Inquiry into Generative AI Investments and Partnerships: https://www.ftc.gov/news-events/news/press-releases/2024/01/ftc-launches-inquiry-generative-ai-investments-partnerships
[3] The White House – Fact Sheet: Biden-Harris Administration Unveils Critical Steps to Protect Workers from Risks of Artificial Intelligence: https://www.whitehouse.gov/briefing-room/statements-releases/2024/05/16/fact-sheet-biden-harris-administration-unveils-critical-steps-to-protect-workers-from-risks-of-artificial-intelligence/
[4] The Washington Post – You are hardwired to blindly trust AI. Here’s how to fight it.: https://www.washingtonpost.com/technology/2025/06/03/dont-trust-ai-automation-bias/
[5] Time – New Study Suggests Using AI Made Doctors Less Skilled at Spotting Cancer: https://time.com/7309274/ai-lancet-study-artificial-intelligence-colonoscopy-cancer-detection-medicine-deskilling/