4 November 2025

AI for schools is entering a new phase where transparency, scale, and safety have to grow together

Dan Hart

CEO, Co-Founder, CurricuLLM

This week felt like a mash-up of three timelines running at once.

One is the deep research timeline, where labs are trying to make models more explainable and monitorable.

Another is the deployment timeline, where real systems are rolling out to real students at state scale.

And the third is the product timeline, where everyone is trying to turn impressive demos into reliable tools that can be used every day without breaking trust.

If you care about AI for schools, all three matter, because the next chapter is less about "can the model do it" and more about "can we run this safely, fairly, and at scale".

Can an AI model think about its own thinking?

Anthropic published research on introspection, basically whether a model can notice and describe what is happening inside its own system.

The idea is simple and big at the same time. If models could reliably report what they are doing internally, we might be able to:

  • understand behaviour better
  • debug failures faster
  • monitor risky patterns earlier
  • improve transparency for developers and users

The research tested this by injecting known ideas into a model's internal process, then asking the model if it could detect what was added.
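To make that concrete, here is a rough sketch of what "injecting a concept" can look like in a transformer: add a known activation vector into one of the model's hidden layers via a forward hook, then prompt the model to say whether it notices anything unusual. The model, layer choice, and scaling factor below are illustrative assumptions, not Anthropic's actual experimental setup.

```python
# Illustrative sketch only: inject a known "concept vector" into a hidden layer,
# then ask the model whether it notices anything being added. Model name, layer
# index, and scaling are assumptions for demonstration, not Anthropic's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in open model; the research used Claude models
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

# 1. Build a crude "concept vector" from the hidden states of a concept prompt.
with torch.no_grad():
    concept_ids = tok("the ocean", return_tensors="pt").input_ids
    hidden = model(concept_ids, output_hidden_states=True).hidden_states
    concept_vector = hidden[6].mean(dim=1)  # layer 6, averaged over tokens

# 2. Register a hook that adds the concept vector into that layer's output.
def inject(module, inputs, output):
    hidden_states = output[0]
    return (hidden_states + 4.0 * concept_vector,) + output[1:]

handle = model.transformer.h[6].register_forward_hook(inject)

# 3. Ask the model to introspect while the injection is active.
prompt = "Do you notice anything unusual being added to your thoughts? Answer:"
ids = tok(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    out = model.generate(ids, max_new_tokens=30, pad_token_id=tok.eos_token_id)
print(tok.decode(out[0], skip_special_tokens=True))

handle.remove()  # always clean up the hook
```

A small open model like this won't show the introspective behaviour the paper describes; the point of the sketch is just the experimental shape: inject something known, then ask the model to report on it.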

The headline results:

  • newer models (Claude Opus 4 and 4.1) sometimes noticed the changes and described them correctly
  • performance was inconsistent; the models only succeeded part of the time
  • when asked to think about a concept, the models showed some ability to steer their internal focus

This isn't "problem solved". But it's a very practical direction. Interpretability and monitorability are not philosophical extras anymore. They're becoming core infrastructure if we want these systems to be trustworthy.

Read more: Introspection research from Anthropic

NSWEduChat rolling out to students is what real scale looks like

NSWEduChat has begun rolling out to students across NSW public schools.

Over the next two weeks, access will steadily increase until all students in Year 5 and above can use it.

This is a huge milestone. It's likely one of the largest deployments of generative AI in K-12 education anywhere in the world.

And what I like about it is that the purpose is clear. It's about learning support at scale:

  • regional and metro
  • small schools and large schools
  • equitable access to safe, curriculum-aligned help

I also love that the story isn't just "AI is here". It's the human impact.

A line from a principal that's stuck with me:

"It's like the teacher's helper while you're working with other groups. It works brilliantly in small schools."

And from a Year 5 student describing their writing:

"I wasn't very good at writing at the beginning… I kept improving and kept using EduChat to help me write more."

That's the kind of outcome that matters. Quiet confidence. More practice. More feedback. Students finding their voice.

Info link: NSW Education - Student use of NSWEduChat

Karpathy's "agents are 10 years away" take is actually a useful reality check

Andrej Karpathy has been saying we're still a decade away from true AI agents.

The core point is not "agents are fake". It's that the gap between a demo and a dependable system is massive.

He lists a bunch of hard problems that still need real progress:

  • continuous learning and durable memory
  • reliable tool use and action in the world
  • safety, security, and integration outside the lab

He compares it to early self-driving. The demo looks magical, but the last 20% takes years.

I find this framing calming. It pushes us toward a more honest question:

Where can we use agentic behaviour now, with strong limits, and where do we need to be patient because reliability isn't there yet?

Read more: OpenAI co-founder says AI agents are still 10 years away

Agentic security is a good example of where "bounded agents" can shine

OpenAI announced Aardvark, an agentic security researcher powered by GPT-5, in private beta.

Security is a great use case for bounded agentic workflows because:

  • the work is high volume
  • patterns repeat
  • you can sandbox and validate
  • you can keep humans in review
  • the cost of missing issues is high

Key points mentioned:

  • continuous repo analysis, commit monitoring, and proposed patches
  • multi-stage workflow (analysis and threat modelling, scanning, sandbox validation, patching, human review)
  • integrations with GitHub and existing workflows
  • ability to surface logic flaws, incomplete fixes, and privacy issues
  • reported strong performance on "golden repositories"

The important part for me is the shape of the system. It isn't "AI, go do security". It's "AI, operate inside a controlled pipeline with checks".
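To illustrate that shape (and this is a minimal sketch of the pattern, not Aardvark's real design), here is what a bounded pipeline can look like: each stage is an explicit step, model output is validated in a sandbox before it is trusted, and a human approval gate sits between the agent and anything that ships. All the stage functions below are hypothetical placeholders.

```python
# Minimal sketch of a "bounded agent" pipeline, not Aardvark's implementation.
# Each stage is explicit, model output is validated before use, and a human
# approval gate sits between the agent and anything that touches production.
from dataclasses import dataclass

@dataclass
class Finding:
    description: str
    proposed_patch: str
    validated: bool = False
    approved: bool = False

def analyse_repo(repo_path: str) -> list[Finding]:
    # Placeholder: in a real system, a model would do threat modelling and
    # scanning here. We return a canned finding so the sketch runs end to end.
    return [Finding("Unsanitised input in login handler",
                    "escape user input before building the query")]

def validate_in_sandbox(finding: Finding) -> Finding:
    # Placeholder: re-run tests / exploit checks in an isolated environment.
    finding.validated = True
    return finding

def human_review(finding: Finding) -> Finding:
    # The gate the agent cannot bypass: a person decides what ships.
    answer = input(f"Apply patch for: {finding.description}? [y/N] ")
    finding.approved = answer.strip().lower() == "y"
    return finding

def run_pipeline(repo_path: str) -> list[Finding]:
    approved = []
    for finding in analyse_repo(repo_path):
        finding = validate_in_sandbox(finding)
        if not finding.validated:
            continue  # never act on unvalidated model output
        finding = human_review(finding)
        if finding.approved:
            approved.append(finding)  # only now would a patch be applied
    return approved

if __name__ == "__main__":
    print(run_pipeline("./example-repo"))
```

The design choice that matters is that the agent only proposes; validation and approval are separate, non-negotiable steps. That is the same structure I'd want for any agentic workflow in a school context.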

That pattern is exactly what we need more of in education too.

Read more: Introducing Aardvark

FutureProof Education is the kind of system-level work we need more of

UNESCO and the European Commission launched FutureProof Education, a joint initiative to help schools prepare for the AI era.

It's bringing together authorities from Belgium, Germany, Ireland, Luxembourg, and Sweden to develop strategies and tools that make AI integration more ethical, practical, and evidence-based.

Over the next two years, the project aims to:

  • create system-wide strategies for AI in teaching, learning, and school management
  • develop toolkits and professional development for teachers and leaders
  • explore validation methods so AI systems support quality learning

What stands out is the bottom-up, collaborative design. Each country tailors the work to local priorities, while contributing to shared learning.

Honestly, this feels like a model that could work well across Australian states too.

Read more: FutureProof Education - supporting schools in AI evolution

Where I'm sitting after all this

If you stitch these stories together, you get a pretty clear picture of what the next phase of AI for schools needs.

1. Scale without trust is fragile. Big rollouts only work if they are safe, curriculum-aligned, and designed for equity.

2. Bounded autonomy beats magical autonomy. Agents are most useful when they operate inside clear workflows with checks, not as free-ranging "do anything" bots.

3. Transparency work is not optional anymore. Introspection, interpretability, monitorability… this is the plumbing we need if we want systems people can rely on.

4. System-level support matters. Toolkits, professional learning, validation methods, shared standards. That's how you avoid "some schools fly, others fall behind".

The tech will keep moving either way.

The real choice is whether we build the surrounding system that makes it safe, fair, and genuinely helpful for students and teachers.
