DWP AI: what do we know?

Yaning Wu / Better Images of AI / prediction / CC-BY 4.0

There were a couple of stories in the Guardian last week about DWP’s use of AI: one on the fact that they have halted a couple of proof-of-concept trials, and one about their Whitemail system which scans incoming post from claimants.

There is very little public information about DWP AI, so any stories like this are very interesting, not least because they only came about as a result of Freedom of Information requests.

The proof-of-concept (PoC) trials were called A-cubed and Aigent: A-cubed was meant to quickly summarise regulation and policy information to help Work Coaches advise jobseekers, and Aigent was a tool to help speed up PIP decision-making. Whitemail, which is still in use, scans incoming post from benefit claimants and is meant to be able to pick out indications of ‘vulnerable’ people from their letters, triggering an escalation to a human who reviews the letter and decides what, if any, action is needed to support the individual.

I put in my own FoIs last year about these same tools and have a couple more nuggets of information about them which were not in the Guardian articles.

A-cubed change of scope: When I first read about A-cubed, it was meant not only to summarise complex official documentation but also to draw on correspondence between the department and benefit claimants to aid Work Coaches. The responses I got to my FoIs indicate that this second aspect was not incorporated in the PoC, and that DWP staff were instructed not to enter any Personally Identifiable Information into A-cubed. They didn’t provide any information about why the scope of A-cubed changed.

Its use was limited to a few ‘test sites’ to test its feasibility, and some of the lessons learned from the PoC are being used to inform the creation of a new AI tool, DWP Ask, which is currently in development. I’ve asked for more information about what was learned and what Ask is going to do.

Whitemail training: Whitemail (named after the fact that letters sent to the department from individuals as opposed to companies and other departments tend to arrive in white rather than brown envelopes) is used to quickly scan incoming post to identify vulnerable people. It uses natural language processing and large language models to identify potentially vulnerable people from their correspondence, which is then reviewed by a human to determine if any further action is needed.

So far so good? Quickly identifying people who need help and ensuring they receive it is a valuable aim. However, it’s not really clear to me how it works. The FoI response I received states that Whitemail is a ‘pre-trained’ model. Pre-trained by whom, using what data? I believe the supplier of Whitemail to be Agilysis, although weirdly, when you google ‘whitemail agilysis’, some results that I found previously no longer appear. The only mention is this short post from 2015 (?!)

I’ve submitted a follow-up FoI to find out more about how Whitemail actually works. Does the Department know how the model was trained? Are they confident that it really does pick up all possible vulnerable people?

And as an aside, what about that date? Has Whitemail really been in use since 2015? According to my FoI response it was introduced in 2023. More digging required?

Aigent training with health data: On Aigent, I got very little back. Probably of most interest is that it was trained on ‘redacted data previously used in Health’, and that no Personally Identifiable Information was used to train the model. I’d love to speak to a data privacy specialist to unpack this.

Overall, it’s positive that the Department are halting the use of AI tools when they do not perform as desired, and that lessons are being learned as they build new tools. However, the fact that we only know this through Freedom of Information requests is less positive. Building trust in these tools is essential if the public is ever going to have confidence in them (let’s put aside for now the question of whether AI should be used at all in life-changing decisions by the welfare state).

Departments should be much more proactive about discussing what they are working on, what they are learning along the way and how that’s improving their products. It’s a big drain on everyone’s time to wheedle this information out via FoIs, for them as much as for me!

Anna Dent