Ethics in AI: Part 5

An infographic illustrating the feedback loops in content recommendation and credit approval systems, emphasizing the cumulative effect of small errors on structural harm.

Social Impact

Years ago — life before iTunes, before Spotify, before algorithms knew what you liked before you did — I heard the most beautiful song on the radio.

I rushed for a pen and paper. Too late. The DJ had already moved on. I had no title, no artist, no way to find it. Just the music, lodged somewhere in memory, with nowhere to go.

That song stayed with me for years.

Then one day I heard it again. I recognised it immediately — the same ache, the same sound. This time I was ready. Moments later I was on Amazon, searching for what I’d finally caught: Nick Drake. River Man. I bought the CD — Five Leaves Left — and discovered one of the most quietly extraordinary artists I have ever heard.

Nick Drake never had a hit in his lifetime. He sold a few thousand records. He died at twenty-six, largely unknown. His reputation grew slowly, entirely through human recommendation — one person telling another, a song surfacing unexpectedly on a radio programme, a stranger pointing someone in the right direction. Decades after his death, he is considered a towering influence.

I think about that story when I think about what recommendation algorithms do — and what they can’t do.

Those moments of genuine discovery are becoming rarer. And it is not an accident.


In previous parts of this series we have examined bias and fairness, privacy and consent, and transparency and explainability. Each of those topics asks what happens when AI gets something wrong in the moment — a biased decision, a privacy violation, an unexplained outcome. Social impact asks a different and harder question: what happens when AI gets things right, consistently, at scale — and the cumulative effect is still harmful?

This is the part of AI ethics that is easiest to overlook. There is no single decision to challenge. No obvious moment of failure. Just a system doing exactly what it was designed to do, and a world quietly changing around it.


The Feedback Loop

Machine learning systems influence social structures when deployed at scale. Decisions about credit approval, employment screening, content recommendation, and public resource allocation affect opportunities and outcomes for individuals and communities — not just once, but continuously, and often invisibly.

The mechanism behind many social harms is the feedback loop: a system trained on past behaviour makes decisions that shape future behaviour, which then becomes the training data for the next version of the model. Each cycle reinforces what came before. Small biases become structural ones. Initial disparities widen. And because every individual decision appears reasonable, the cumulative drift goes unnoticed until the damage is done.


Example One: The Playlist That Narrows

Consider a music streaming platform that recommends songs based on what users have previously listened to. A user starts with a few popular mainstream artists. The system, doing its job, recommends more of the same. Over time, the user is repeatedly exposed to the same genres, the same sounds, the same familiar names — while niche and emerging artists remain invisible.

The feedback loop runs like this: past listening shapes recommendations, recommendations reinforce listening patterns, and those patterns feed the next round of recommendations. Popular artists become more popular. Smaller artists remain underrepresented. Not because the algorithm intended to marginalise them — but because it was optimised for engagement, and engagement follows familiarity.
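To make that loop concrete, here is a minimal sketch in Python. It is a toy model, not a description of how any real platform works: the function name, the starting play counts, the 80% follow rate and the two recommendation slots are all assumptions chosen for illustration.

```python
def simulate_playlist_loop(initial_plays, rounds, top_k=2,
                           listens_per_round=100.0, follow_rate=0.8):
    """Toy feedback loop: each round recommends the current top_k artists.

    Every parameter is hypothetical. follow_rate is the share of listens
    that go to recommended artists; the rest is spread evenly over everyone.
    """
    plays = list(initial_plays)
    top_share_history = []
    for _ in range(rounds):
        # Rank artists by plays so far (this is the "training data").
        ranked = sorted(range(len(plays)), key=lambda i: plays[i], reverse=True)
        recommended = ranked[:top_k]
        organic = listens_per_round * (1 - follow_rate) / len(plays)
        boosted = listens_per_round * follow_rate / top_k
        for i in range(len(plays)):
            plays[i] += organic
        for i in recommended:
            plays[i] += boosted
        # Share of all plays held by the artists that were recommended.
        top_share_history.append(sum(plays[i] for i in recommended) / sum(plays))
    return plays, top_share_history


start = [12, 11, 10, 10, 9, 8]  # six artists with nearly equal play counts
final_plays, shares = simulate_playlist_loop(start, rounds=10)
print(f"top-2 share at start: {sum(sorted(start)[-2:]) / sum(start):.2f}")
print(f"top-2 share after 10 rounds: {shares[-1]:.2f}")
```

No single round does anything unreasonable, yet the share held by the two early leaders climbs every round, because each round's output becomes the next round's input.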

Each recommendation, taken alone, is perfectly reasonable. A user who likes one thing probably likes similar things. But the cumulative effect reshapes what people discover, what gains cultural traction, and ultimately who earns a living from their music. A technical optimisation becomes a cultural force. And nobody pressed a button that said “narrow the culture.”


Example Two: The Loan That Was Never Offered

Now consider a model used to decide who gets approved for a loan.

If the historical data that trained the model reflects decades of biased lending practices — and it often does — the model will learn to reject applicants from certain demographic groups at higher rates. It is not making a racist decision in the way a human might. It is making a statistically grounded one, based on patterns in the data. But those patterns are the residue of past discrimination.

The feedback loop here is more severe, and the stakes are higher:

  • Fewer approved loans → fewer opportunities to build credit, start businesses, or buy homes
  • Fewer opportunities → continued financial disadvantage
  • Continued disadvantage → future data that confirms the model’s original assessment

The system appears statistically accurate. It is. And it is also socially harmful. The two things are not mutually exclusive — which is what makes this so difficult to resolve by purely technical means.
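The three steps above can be sketched in a few lines of Python. This is a deliberately crude toy, assuming two applicants who differ only in their starting score; the threshold, the yearly gain from having credit, and the yearly loss from being refused are invented numbers, not figures from any real lender.

```python
def simulate_credit_gap(score_a, score_b, threshold=600, years=5,
                        gain=15, loss=5):
    """Toy model: approval builds a credit score, rejection erodes it.

    All values are hypothetical and chosen only to show the direction
    of the effect.
    """
    gaps = []
    for _ in range(years):
        # The same rule is applied to both applicants every year.
        score_a += gain if score_a >= threshold else -loss
        score_b += gain if score_b >= threshold else -loss
        gaps.append(score_a - score_b)
    return gaps


gaps = simulate_credit_gap(score_a=610, score_b=590)
print(gaps)  # [40, 60, 80, 100, 120]
```

The rule never changes and never mentions a group. A 20-point starting gap still becomes a 120-point gap in five years, and the later data appears to confirm that the second applicant was the riskier one all along.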


Scale Changes Everything

A small systematic error, repeated at scale, produces significant societal consequences. This is the central insight of social impact analysis in AI.

A single biased loan decision is a wrong that can be appealed. A biased model making ten thousand decisions a day, over years, without review, is a structural shift in who gets access to capital. A streaming algorithm that slightly deprioritises independent artists across a platform of three hundred million users does not just affect listening habits — it shapes the economics of an entire industry.
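The arithmetic behind that claim is simple enough to write down. In this sketch the ten thousand decisions a day comes from the paragraph above; the 1% excess error rate and the five-year window are assumptions for illustration.

```python
def cumulative_affected(decisions_per_day, excess_error_rate, years):
    """Rough count of people affected by a small systematic error.

    Assumes a constant daily volume and a constant error rate, which is
    a simplification.
    """
    return round(decisions_per_day * excess_error_rate * 365 * years)


affected = cumulative_affected(10_000, 0.01, 5)
print(affected)  # 182500
```

A one-in-a-hundred error that nobody would notice on any given day adds up to roughly 180,000 affected decisions under these assumptions.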

This is why social impact analysis requires evaluating not only individual predictions but cumulative effects. It requires monitoring mechanisms capable of detecting harm early — before it becomes entrenched. It requires stakeholder engagement with affected communities, because the people most likely to identify risks are often the ones the system is making decisions about. Technical analysis, however rigorous, cannot see what it has not been designed to look for.
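What might a monitoring mechanism look like at its very simplest? One possible sketch, assuming monthly approval counts for two groups are already being logged: compare approval rates each month and raise a flag when the ratio drops below a chosen level. The function names, the sample counts and the 0.8 alert level are all hypothetical, and a real monitor would need far more care around sample sizes and definitions.

```python
def approval_rate_ratio(approved_a, total_a, approved_b, total_b):
    """Ratio of the lower approval rate to the higher one (1.0 = equal)."""
    rate_a = approved_a / total_a
    rate_b = approved_b / total_b
    higher = max(rate_a, rate_b)
    return 1.0 if higher == 0 else min(rate_a, rate_b) / higher


def flag_drift(monthly_counts, alert_below=0.8):
    """Return the indices of months whose ratio falls below alert_below."""
    return [
        month
        for month, counts in enumerate(monthly_counts)
        if approval_rate_ratio(*counts) < alert_below
    ]


# (approved_a, total_a, approved_b, total_b) per month; invented numbers
monthly = [(50, 100, 45, 100), (52, 100, 42, 100), (55, 100, 40, 100)]
flagged = flag_drift(monthly)
print(flagged)  # [2]
```

The point is less the threshold than the habit: the first two months look fine in isolation, and only the trend across months shows where the system is heading.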

Machine learning systems are not neutral tools. They are components of socio-technical systems — embedded in institutions, shaped by history, and capable of reinforcing or redirecting the structures they operate within. Their evaluation must extend beyond statistical metrics to include institutional and societal considerations. That is not a soft requirement. It is an engineering one.


Asking “does the model perform well?” is no longer sufficient. The question that matters is: “What does the world look like after this model has been running for five years?”

Social impact has to be built in, not bolted on.


Next: Part 6 — Ethical Trade-offs. The honest conclusion: there are no perfect answers. Only deliberate choices.

Ethics in AI: Part 4

Transparency and Explainability: inside the black magic box

Diagram illustrating the concept of explainability in AI ethics, featuring a blue background, large text that says 'ETHICS IN AI / PART 04,' and visual representations of explainable data, predictions, and algorithms, linked to an 'ACCOUNTABLE DECISION' node.

Ever since I can remember, I’ve wanted to know how things work. Not just that they work — but why, and what’s going on inside.

My parents had a name for it: Fiddle Fingers. Clocks, radios, household appliances: nothing was safe. I’d take them apart with genuine curiosity and varying degrees of success. The parts I couldn’t reassemble quietly disappeared under my bed. I even unscrewed the back of a plug once, purely to see electricity. The thunderbolt that shot up my arm was, in hindsight, a reasonable price for the lesson. My parents, to their credit, were more understanding than the appliances deserved.

Later I studied motor vehicle engineering — which at least meant I could finally take things apart professionally. And now, working through my MSc, I’m still doing the same thing. Still looking under the hood. Still asking what’s in the box.

Which makes the subject of this post a personal one. Because one of the most troubling things about modern AI systems is that they often don’t let you look inside. Not because the technology prevents it — but because transparency and explainability haven’t been treated as priorities. The box stays closed. And when the box is closed, accountability becomes very difficult to defend.

In the previous parts of this series we’ve looked at bias and fairness, and at privacy and consent: the ethical challenges that emerge when AI systems learn from people’s data and make decisions that affect their lives. This instalment moves into territory that sits underneath all of those: if you can’t see how a system works, and can’t explain what it decided, the other ethical principles become very difficult to uphold.

Transparency and explainability are the mechanisms that make accountability possible. Without them, everything else is aspiration.


Two related concepts, one shared purpose

The terms transparency and explainability are often used interchangeably. They shouldn’t be — they address different things, and the distinction matters.

Transparency concerns the visibility of the system as a whole. How was it built? What data was it trained on? What modelling choices were made, and why? What does the performance data actually show? Transparency enables external scrutiny. It supports governance, auditability and regulatory oversight. Without it, independent evaluation becomes impossible — you’re simply asked to trust the outcome without any means of checking it.

Explainability concerns the individual decision. Not the system in aggregate, but this specific output: why did the model produce this result for this person, in this context, at this moment? In high-stakes settings — healthcare, criminal justice, financial services — that question isn’t academic. It’s a matter of rights.

Think of it this way. Transparency lets you audit the factory. Explainability lets you understand why one particular product came off the line the way it did.

Both matter. And in most real-world deployments of AI today, both are harder to achieve than the marketing suggests.


The three questions explainability has to answer

When we talk about making an AI system explainable, we’re really asking three distinct questions — and each requires a different kind of answer.

The first is about data. What information was used to train the model, and why was it chosen? This isn’t just a technical question. Training data encodes assumptions about the world, and those assumptions shape every output the model produces. If the data can’t be explained and justified, the decisions downstream can’t be either.

The second is about predictions. What features and weights drove this particular output? Why did the model score this applicant lower than another? Which variables carried the most influence, and in what direction? This is where post hoc explanation techniques — tools that interpret model behaviour after the fact — do most of their work.
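For the simplest case, a linear score, that second question has an exact answer: each feature's contribution is its weight multiplied by its value. The sketch below assumes a made-up credit score with invented weights and feature names; for complex models, post hoc tools can only approximate this kind of breakdown.

```python
def explain_linear_score(weights, features, baseline=0.0):
    """Break a linear score into per-feature contributions.

    Returns the score and the contributions sorted by absolute size,
    largest first. Weights and feature names are hypothetical.
    """
    contributions = {
        name: weights[name] * value for name, value in features.items()
    }
    score = baseline + sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]),
                    reverse=True)
    return score, ranked


weights = {"income_k": 0.02, "missed_payments": -0.5, "years_at_address": 0.1}
applicant = {"income_k": 40, "missed_payments": 2, "years_at_address": 3}

score, ranked = explain_linear_score(weights, applicant)
for name, contribution in ranked:
    print(f"{name}: {contribution:+.2f}")
```

Here the two missed payments pull the score down by more than the income pushes it up, which is something an applicant can understand, check and contest.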

The third is about the algorithm itself. What are the layers, the thresholds, the decision boundaries? How does the model move from input to output? For simpler models, this question has a direct answer. For more complex ones, it often doesn’t — which is where the central tension of this topic lives.


COMPAS: when a black box meets a courtroom

No case study illustrates the stakes of transparency and explainability more starkly than COMPAS — the Correctional Offender Management Profiling for Alternative Sanctions tool, widely used in the United States to assess the risk that a defendant will reoffend.

Judges used COMPAS scores to inform decisions about bail, sentencing and parole. The scores carried real weight in outcomes that determined whether people went home or went to prison. And yet the algorithm that produced those scores was proprietary. Defendants had no means of understanding how their score was calculated, no ability to identify errors in the underlying data, and no realistic way to challenge the output in court.

In 2016, ProPublica published an investigation showing that COMPAS was far more likely to wrongly flag Black defendants as high risk: among defendants who did not go on to reoffend, Black defendants were nearly twice as likely as white defendants to have been labelled higher risk. The tool wasn’t just opaque. It was producing outcomes that were racially skewed in one of the highest-stakes contexts imaginable.

The Loomis case reached the Wisconsin Supreme Court in 2016, where the defendant argued that using a proprietary, unexplainable algorithm in sentencing violated his right to due process. The court upheld the use of the tool, while cautioning that a risk score should not be the determinative factor in a sentence. The algorithm remained a black box.

COMPAS sits at the intersection of everything that matters in this conversation. Transparency was absent — no visibility into the model’s design, data or validation. Explainability was absent — no way to interrogate individual decisions. And the consequences were borne by people who had no recourse and no means of understanding why.


The tension that doesn’t go away

Here is the dilemma that transparency and explainability force us to confront — and it doesn’t have a clean resolution.

The models that tend to perform best on complex, real-world prediction tasks are also the least interpretable. Deep neural networks, gradient boosting models, large ensemble methods — these approaches can achieve superior predictive accuracy precisely because they capture subtle, non-linear relationships in data that simpler models miss. But that complexity comes at a cost: the internal workings become difficult, sometimes impossible, to explain in terms a human can meaningfully interpret.

Simpler models — linear regression, decision trees, rule-based systems — offer genuine interpretability. You can follow the logic from input to output, identify which variables matter and by how much, and explain a decision to the person it affects. But they often sacrifice accuracy to do it. In a noisy, high-dimensional real world, simpler models sometimes just get more things wrong.

This is not a technical problem waiting for a technical solution. It is a genuine ethical trade-off. In some contexts — say, a recommendation engine for a streaming service — that trade-off sits comfortably on the side of performance. In others — a credit decision, a medical diagnosis, a criminal risk score — the question of what we’re willing to sacrifice for accuracy becomes a question of values, not engineering.

Regulatory frameworks are beginning to codify where that line falls. The EU AI Act classifies high-risk AI applications and mandates transparency and explainability requirements accordingly. The GDPR restricts decisions based solely on automated processing and requires meaningful information about the logic involved, which is often described as a right to explanation, although how far that right extends is still debated. But regulation sets a floor, not a ceiling, and the honest truth is that many organisations are still well below it.


What good looks like in practice

Transparency and explainability aren’t binary. They exist on a spectrum, and the appropriate level depends on context — the stakes involved, the people affected, and the regulatory environment in play.

For high-risk applications, the baseline should include clear documentation of training data, modelling choices and performance metrics across demographic groups; post hoc explanation tools that can surface the key drivers of individual decisions; human review mechanisms for decisions that significantly affect individuals; and the ability to audit the system independently — not just internally.

For lower-risk applications, lighter-touch approaches may be proportionate. But the principle remains: the system should be able to account for itself, and the people it affects should have a meaningful way to understand and, where necessary, challenge its outputs.

The temptation to treat explainability as a presentation problem — a dashboard, a label, a percentage confidence score — should be resisted. A number on a screen is not an explanation. An explanation is something a person can interrogate, reason about and act on.


Closing thought

There is a version of AI development where transparency and explainability are treated as compliance tasks — boxes to tick, documentation to file, a report to produce before launch. That version produces systems that look accountable without being accountable.

The harder version asks the question earlier: before a model is selected, before a dataset is assembled, before a use case is approved. It treats interpretability as a design constraint, not an afterthought. It asks whether a complex model is actually necessary, or whether a simpler, more explainable one would serve the purpose well enough.

That version is also the honest version. Because when a system makes a decision that changes someone’s life — and they ask why — “the algorithm is proprietary” is not an answer any ethical organisation should be comfortable giving.

Transparency and explainability have to be built in, not bolted on.

Ethics in AI: Part 3

Infographic on Ethics in AI, illustrating different consent types (genuine, passive, implicit) and key processes including data collection, storage, processing, deployment, and retirement, with emphasis on privacy and data minimization.

Privacy and Consent

Think about the last time you clicked “I agree.”

Chances are you didn’t read what you were agreeing to. Neither did most people. And somewhere in that moment — buried in a wall of legal text nobody was realistically going to parse — a decision was made about your data. What would be collected. How it would be stored. What it would be used for. How long it would be kept. Who else might see it.

That’s passive consent. And it’s the foundation a significant proportion of AI training data is built on.

AI systems depend on personal data. That’s not a criticism — it’s a reality. The predictive power that makes modern AI useful is inseparable from the data that feeds it. But the collection, storage, and processing of personal data introduce privacy considerations that ethical design cannot treat as an afterthought. Because behind every data point is a person. And that person had a reasonable expectation about what would happen to it.

The refinery analogy applies here — but at a more fundamental level than processing. Before the question of how data is refined, there’s a more important question: did you have the right to extract it in the first place? Privacy and consent isn’t about the quality of the pipeline. It’s about the right to dig.


Consider what happened in 2015 when Google’s DeepMind division received 1.6 million patient records from the Royal Free London NHS Foundation Trust. The stated purpose was to develop Streams, a clinical app designed to detect acute kidney injury. The intent was genuinely beneficial. But the patients whose records were transferred were never properly informed. They didn’t consent. Many of them had no idea their data had changed hands at all. In 2017 the Information Commissioner’s Office ruled that the Trust had failed to comply with data protection law. The ruling came not because the technology failed, but because the legitimacy of the data source was never established.

Good intentions are not a substitute for consent. The pipeline was clean. The extraction wasn’t.


Genuine consent requires clarity. The person providing data should understand what it will be used for, the scope of that use, and how long it will be retained. Not in principle — in practice. In language a reasonable person can understand, not language engineered to satisfy a legal requirement while obscuring the reality.

Passive or implicit consent — the pre-ticked box, the buried clause, the “by continuing to use this service” small print — undermines that legitimacy entirely. If someone wouldn’t consent knowing the full picture, then designing the consent mechanism to obscure the full picture isn’t a workaround. It’s a violation.

Data minimisation offers a practical discipline: collect only what is genuinely necessary for a defined objective. Not what might be useful one day. Not what’s technically available. What is actually required. The instinct in data-driven organisations is to collect everything and decide later what matters. Ethically, that instinct needs to be resisted.
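In code, that discipline can be as plain as an allow-list per purpose. The sketch below is one possible shape, assuming a hypothetical loan decision and invented field names; which fields are genuinely necessary is exactly the judgement the paragraph above is asking for.

```python
# Fields each purpose is allowed to keep. Purpose and field names are
# hypothetical examples, not a recommendation.
REQUIRED_FIELDS = {
    "loan_decision": {"income", "existing_debt", "repayment_history"},
}


def minimise(record, purpose):
    """Keep only the fields declared as necessary for the given purpose."""
    allowed = REQUIRED_FIELDS[purpose]
    return {key: value for key, value in record.items() if key in allowed}


raw = {
    "income": 42000,
    "existing_debt": 3000,
    "repayment_history": "good",
    "postcode": "AB1 2CD",
    "browsing_history": ["..."],
}
kept = minimise(raw, "loan_decision")
print(sorted(kept))
```

An undeclared purpose raises a KeyError rather than silently keeping everything, so collecting a new field requires someone to write down why.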


Technical safeguards exist and they matter. Encryption protects data at rest and in transit. Federated learning allows models to be trained across distributed data sources without centralising sensitive information. Differential privacy adds carefully calibrated noise to query results or to the training process, so that no individual’s presence in the data can be confidently inferred, while preserving statistical utility.
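To give a feel for what calibrated noise means, here is a minimal sketch of one standard approach, the Laplace mechanism, applied to a single count. It assumes a counting query (where one person can change the answer by at most 1) and an illustrative privacy budget of epsilon = 0.5; the function name and the numbers are mine, and a production system should use a vetted library rather than this.

```python
import random


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with Laplace noise of scale sensitivity / epsilon.

    Smaller epsilon means more noise and stronger privacy.
    """
    rate = epsilon / sensitivity
    # The difference of two independent exponential draws with the same
    # rate follows a Laplace distribution centred on zero.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


rng = random.Random(0)  # seeded only so the example is repeatable
releases = [private_count(120, epsilon=0.5, rng=rng) for _ in range(5)]
print([round(value, 1) for value in releases])
```

Each released number is wrong by a little, on purpose. That is the trade-off in miniature: the noise that protects any one individual is also the noise that costs accuracy.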

But these tools come with trade-offs. Restricting information access frequently reduces predictive performance — a model trained on anonymised data may be less accurate than one trained on raw personal data. That gap is real. And navigating it honestly requires asking a genuine question about proportionality: is the predictive gain worth the privacy intrusion? Not as a rhetorical question. As a design decision, documented and accountable.


Privacy considerations don’t begin at model training and end at deployment. They run the full length of the workflow — from the moment data is collected, through every transformation and pipeline stage, through deployment, monitoring, and eventual retirement. Data retention policies and governance frameworks aren’t compliance bureaucracy. They’re the institutional memory that makes accountability possible.

And like fairness, privacy cannot be retrofitted. The decisions that matter most are made at the very beginning — what to collect, how to collect it, and whether you had the right to collect it at all.

Privacy, like fairness, has to be built in. Not bolted on.


Next: Part 4 — Transparency and Explainability.

Ethics in AI: Part 2

Infographic on 'Ethics in AI' highlighting various types of biases including historical, representation, measurement, and evaluation biases, along with the importance of designing fairness in AI systems. Features a processing node for AI and labels for input and output stages.

Bias and Fairness

Here’s the uncomfortable truth about AI bias. It rarely announces itself. There’s no error message, no warning light, no moment where the system admits it has a problem. The model just does what it was trained to do — faithfully, at scale, and often with complete confidence.

Because bias in AI isn’t primarily about algorithmic malfunction. It’s about inheritance. AI systems learn from data that reflects the world as it was, not necessarily the world as it should be. And if that data carries the fingerprints of historical inequality, underrepresentation, or flawed measurement — the model doesn’t question it. It learns it.

The refinery analogy holds here too, and it holds uncomfortably well. Contaminated feedstock doesn’t trigger an alarm at the intake valve. It moves through the system, gets processed with everything else, and comes out the other end as contaminated output — refined, packaged, and delivered with the same confidence as everything around it. The pipeline worked perfectly. That’s the problem.

So where does the contamination get in? There are four entry points worth understanding.


Historical Bias is perhaps the most insidious. It occurs when training data reflects pre-existing inequality — when past decisions were themselves discriminatory, and those decisions become the ground truth a model learns from. A hiring algorithm trained on a decade of appointment records from a male-dominated industry doesn’t invent a preference for male candidates. It learns one from the data. The discrimination was already there. The model just scaled it.

Representation Bias arises when certain groups are underrepresented in training data. A facial recognition system trained predominantly on lighter-skinned faces will perform less accurately on darker-skinned ones — not because of any deliberate design choice, but because the training set didn’t reflect the full diversity of the population it would eventually serve. Gaps in the data become gaps in performance. And those gaps tend to fall hardest on the people already least served by existing systems.

Measurement Bias is subtler still. It emerges when the features used to train a model capture some populations less accurately than others. Predictive health models built on clinical data may underperform for groups historically less likely to access formal healthcare — not because the model is poorly designed, but because the measurement itself was uneven at source. The data records what was measured. It can’t record what wasn’t.

Evaluation Bias occurs when the test data used to validate a model doesn’t reflect the diversity of the real world it will operate in. A model can pass every benchmark with flying colours and still fail badly for populations underrepresented in the evaluation set. If you don’t test for it, you won’t find it — and you won’t find it until it’s already causing harm in deployment.
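A small sketch shows how a headline benchmark can hide this. The data below is invented: 90 test records from a well-represented group A and 10 from an underrepresented group B. The habit it illustrates is reporting accuracy per group rather than only overall.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """records: iterable of (group, y_true, y_pred) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


# Invented evaluation set: the model is perfect on A, weak on B.
records = (
    [("A", 1, 1)] * 45 + [("A", 0, 0)] * 45
    + [("B", 1, 0)] * 4 + [("B", 0, 0)] * 6
)

overall = sum(y_true == y_pred for _, y_true, y_pred in records) / len(records)
per_group = accuracy_by_group(records)
print(overall, per_group)  # 0.96 {'A': 1.0, 'B': 0.6}
```

A 96% headline figure sits on top of a model that is wrong four times in ten for group B. With only ten group B records, even that estimate is shaky, which is itself part of the problem.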

Four different entry points. The same result: a model that has learned the wrong lessons, at scale, and has no idea.


Consider a real example. From 2014, Amazon developed a CV screening tool to help filter job applicants. It was trained on a decade of historical hiring data: CVs submitted to the company over that period, most of them from men. The problem was that the data reflected a male-dominated industry. The model didn’t know that. It just learned what a successful candidate looked like based on the evidence it was given. Over time it began penalising CVs that included words like “women’s” (as in women’s chess club, women’s rugby team) and downgrading graduates from all-female colleges. The model wasn’t told to discriminate against women. It learned to. From data that already did. Amazon eventually scrapped the tool, and the story became public in 2018. But the lesson endures: a model trained on biased history will faithfully reproduce that history, at scale, until someone intervenes.

Which brings us to fairness — and here’s where it gets genuinely hard.

Fairness sounds like a simple goal. It isn’t. Multiple formal definitions of fairness exist — statistical parity, equalised error rates, individual fairness, counterfactual fairness — and they are mathematically distinct. More importantly, they frequently conflict. Satisfying one fairness criterion can, and often does, violate another.

A model that achieves equal error rates across demographic groups may not achieve equal outcomes. A model that achieves equal outcomes may not treat similar individuals similarly. There is no single definition of fairness that satisfies all criteria simultaneously — and that means choosing one is not a technical decision. It’s a values decision.
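A tiny worked example makes the conflict visible. The numbers are invented: two groups of ten people, with different underlying rates of the positive outcome, and a model that approves exactly half of each group.

```python
def group_rates(y_true, y_pred):
    """Selection rate, true positive rate and false positive rate."""
    pairs = list(zip(y_true, y_pred))
    positives = sum(t for t, _ in pairs)
    negatives = len(pairs) - positives
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    return {
        "selection_rate": sum(p for _, p in pairs) / len(pairs),
        "true_positive_rate": true_pos / positives,
        "false_positive_rate": false_pos / negatives,
    }


# Group A: 5 of 10 are true positives. Group B: 2 of 10 are.
# The model approves 5 people in each group.
group_a = group_rates([1] * 5 + [0] * 5, [1] * 5 + [0] * 5)
group_b = group_rates([1] * 2 + [0] * 8, [1] * 5 + [0] * 5)
print(group_a)
print(group_b)
```

Both groups are selected at the same rate, so statistical parity holds. But the false positive rate is 0 in group A and 0.375 in group B, so error rates are not equal. Fixing one would mean breaking the other, and which one to protect is the values decision.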

This is the point most AI ethics conversations skirt around. Technical mitigations exist — reweighting training data, adjusting loss functions, post-processing predictions. They matter. But they cannot resolve the underlying question of what fairness actually means in a given context, for a given population, with given consequences. That question requires human judgement, institutional accountability, and explicit deliberation — not optimisation.
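As an illustration of the first of those mitigations, here is a sketch of one common reweighting scheme: give each training record a weight so that, after weighting, group membership and outcome look statistically independent. The sample data is invented, and the choice of this scheme over any other is an assumption for the example, not a recommendation.

```python
from collections import Counter


def reweighing_weights(samples):
    """samples: list of (group, label) pairs.

    Weight for each (group, label) = expected count if group and label
    were independent, divided by the observed count.
    """
    n = len(samples)
    group_counts = Counter(group for group, _ in samples)
    label_counts = Counter(label for _, label in samples)
    pair_counts = Counter(samples)
    return {
        (group, label):
            group_counts[group] * label_counts[label] / (n * count)
        for (group, label), count in pair_counts.items()
    }


# Invented history: group A was approved 3 times in 4, group B once in 4.
samples = [("A", 1)] * 3 + [("A", 0)] + [("B", 1)] + [("B", 0)] * 3
weights = reweighing_weights(samples)
for key in sorted(weights):
    print(key, round(weights[key], 3))
```

The rare combinations (a rejected A, an approved B) are weighted up and the common ones down, so both groups show a weighted approval rate of one half. What the sketch cannot tell you is whether equal rates is the right target in the first place.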

And without that deliberate specification, optimisation processes will do exactly what they were designed to do: maximise predictive accuracy. Not fairness. Accuracy.

Fairness has to be designed in. Because a system left to optimise on its own will not choose it.


Next: Part 3 — Privacy and Consent.

Ethics in AI: Part 1

Diagram illustrating ethical considerations in AI, focusing on processing node failures and their impact on model accuracy and bias.

The Question That Haunted Me

“But doesn’t AI get it wrong?”

It was a fair question. My answer wasn’t.

At the time I was deep in the AI hype — drinking the Kool-Aid, evangelising the technology, convinced the outputs spoke for themselves. So when someone pushed back and asked whether AI could really be trusted, I defended it. Brushed past the concern. Missed the moment entirely.

What I should have done was lean in. Agreed. Had the honest conversation about bias, about missing and unrepresentative training data, about what it actually means when AI gets it wrong at scale — not just technically wrong, but wrong in ways that shape real people’s access to resources, opportunities, and fair treatment. Wrong in ways that can ruin lives.

That question has haunted me for over three years.

Because “AI getting it wrong” isn’t one problem. It’s two very different ones, and treating them as the same is its own kind of mistake.

There are hallucinations — the model confidently generating plausible but factually wrong outputs. A technical failure. The model doesn’t know what it doesn’t know.

And there is bias — the model learning and amplifying the prejudice, inequality, and exclusion already baked into its training data. Not a malfunction. The model working exactly as designed, just on contaminated inputs. Faithfully. At scale.

The person asking that question deserved an answer that acknowledged both. Instead they got a defence of the technology.

Three years later, older and wiser, I’m finally leaning in.

AI doesn’t operate in a vacuum. It operates inside the same social structures, institutions, and power dynamics that have always shaped who gets access and who gets excluded. And because AI learns from data — data generated by humans, in a world with a complicated history — it inherits everything recorded in that data. The assumptions. The gaps. The inequalities baked in long before any algorithm touched it.

That’s why keeping AI ethical isn’t optional, and it isn’t a feature you bolt on at the end. It has to be built in by design — from the data up.

This series is about what that actually means.


Next: Part 2 — Bias and Fairness.