Things July 2019: Bias, Co-operation, Location, Video Games

Extremely Generic Questions

In earlier iterations of Things I would often set readers a puzzle or ask a tricky question, with answers reviewed in the next edition. I’d like to start a new series of these that I’m calling Extremely Generic Questions: questions often asked, but not very specific or even necessarily well-defined. The puzzle is not only to try to find the best answer to a question, but also to understand why it is so often asked.

So, the first question: what is wrong with young people today?

Subconscious bias

If you want to make optimal decisions, and also just be a decent person, I think subconscious bias is a very important factor to be aware of. Many people believe their judgement of aptitude is not influenced by gender (or any other unrelated characteristic), but data suggests this is not the case.

Anecdotally but compellingly, there’s the email signature swap story, in which a male and female colleague swap their email signatures for a week and observe radical differences in how clients interact with them. Pleasingly, you can read the accounts of this from each side.

Auditions for an orchestra have the advantage that they can be conducted in a thoroughly meritocratic manner without ever actually seeing the candidates. It turns out that blind orchestra auditions improved women’s chances of success by 50%.

Similarly, scientific proposals for time using the Hubble Space Telescope tried going partially blind; the results again showed women benefited from a more meritocratic process.

As I am sometimes involved with hiring people for work, I tried a version of this by making sure names were removed from CVs before I reviewed them. Of course, I don’t have any large scale data to compare results, but the feeling of trying to assess a nameless CV was alarmingly transformative! It became very clear that as soon as I saw a name, I would start to construct a mental image based on (irrelevant) associations I had with people similarly named, and would then build on that image as I read the rest of the CV. Without a name as a starting point, the process of evaluation immediately felt like harder work, but also a lot more objective. Based on this and the above findings, I highly recommend it.

Thinking, Fast And Slow

After many years of seeing it recommended, I finally read Daniel Kahneman’s “Thinking, Fast and Slow”, which covers a lot of the biases I’ve been fascinated by for so long from essentially the very coal-face of that research.

I found it fascinating throughout and can see why so many recommended it, although I didn’t always agree with the interpretations. Given that I’m just some guy who studied maths/physics and read a few things, and the author is a Nobel prize winner with decades in the field, I have to recognise that my position on this is likely tenuous, but at least as far as the maths goes I feel I can comment.

Here are my brief highlights.

Probability Problems
There is a strange and fuzzy line between not understanding what a question means, and not getting the answer right. I could ask people what 3 χ 4 is, and if they think it’s 12, that doesn’t mean they’re mysteriously misguided; it more likely means they just don’t know chi-notation*. In many of the studies, participants were shown to give incorrect answers to statements involving probability, but one could just as well argue that participants didn’t really understand the statement and so were guessing. To be fair, the book goes on to show how phrasing probability questions differently (to my mind, more clearly) helps people reach more accurate results.

This is what I talked about in Things 122 on the topic of the Linda Problem / Conjunction Fallacy.

Forecasting and regression to the mean
I have to do quite a bit of forecasting at work, and I was surprised I had never come across this excellent rubric for anticipating a certain amount of regression to the mean.

Briefly: if you evaluate, say, fifty people on a task that involves some luck as well as skill, like accurately throwing something, then the people who did the very best (or the very worst) on their first attempt are unlikely to do as well (or badly) on a second attempt; their results were probably mostly flukes, and they will tend to ‘regress to the mean’.

If I am evaluating 12 different marketing campaigns and trying to forecast how well they do in future, the same kind of rule applies. The one that did very best was at least partially ‘lucky’, so will not necessarily be the best in future.
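To see the effect without any real data, here is a minimal simulation sketch. It assumes each score is a fixed skill plus fresh luck on every attempt, both drawn from a standard normal distribution; that model, the function name and the parameters are my own illustration rather than anything from the book.

```python
import random
from statistics import mean

def simulate_two_attempts(n_people=50, top_fraction=0.1, seed=0):
    """Score = fixed skill + fresh luck on each attempt (an assumed toy model)."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(n_people)]
    first = [s + rng.gauss(0, 1) for s in skill]
    second = [s + rng.gauss(0, 1) for s in skill]

    # Pick the people who did best on the first attempt.
    n_top = max(1, int(n_people * top_fraction))
    top = sorted(range(n_people), key=lambda i: first[i], reverse=True)[:n_top]

    # Average score of that same group on each attempt.
    return mean(first[i] for i in top), mean(second[i] for i in top)

# A larger group than the fifty in the text, so the averages are less noisy.
first_avg, second_avg = simulate_two_attempts(n_people=5000)
print(f"top group, first attempt:  {first_avg:.2f}")
print(f"top group, second attempt: {second_avg:.2f}")
```

With these assumptions the top group should still come out above average on the second attempt (they do have some skill), just noticeably less so than on the first.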

The rubric is as follows:
a) If the measure you want to predict has zero correlation with its future values (which you can figure out by viewing historical data), then you should predict that, regardless of how they did, all the things you are measuring will perform averagely in future.

b) If the measure you want to predict perfectly correlates with the future, so whatever is the best now will be the best in future, then obviously you should predict that.

c) If the correlation between the present and the future is x%, then you should forecast that any present deviation from average performance will decrease by (100-x)%!

That’s the terse version; you can read more about it here.
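As a rough sketch, the rubric above boils down to one line of arithmetic: keep only the share of the deviation from average that the correlation supports. The function name and the campaign numbers below are hypothetical, and the correlation is written as a fraction between 0 and 1 rather than a percentage.

```python
def shrunk_forecast(current, average, correlation):
    """Pull a current value toward the average; only `correlation` of the deviation is kept."""
    if not 0.0 <= correlation <= 1.0:
        raise ValueError("correlation is expected to be between 0 and 1")
    return average + correlation * (current - average)

# Hypothetical campaign results (e.g. sign-ups), not real data.
campaigns = {"A": 180.0, "B": 100.0, "C": 60.0}
average = sum(campaigns.values()) / len(campaigns)

for name, value in campaigns.items():
    # correlation=0.5 is an assumed figure; in practice it comes from historical data.
    print(name, round(shrunk_forecast(value, average, correlation=0.5), 1))
```

A correlation of 0 reproduces case (a), where everything is forecast at the average, and a correlation of 1 reproduces case (b), where the forecast is just the current value.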

Evaluation of experiences
How much you liked or disliked an experience would intuitively be based on how long it was, and how much you were liking or disliking it at the time. Something that was unpleasant for 10 minutes should surely be ranked as worse than something that was unpleasant for only 5 minutes.

In practice, this isn’t how we evaluate things at all. We very highly weight our peak enjoyment (or discomfort), and how happy (or unhappy) we were at the very end of the experience, and a little bit the beginning; the absolute duration plays only a small part.
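Here is a toy comparison of the two ways of scoring an experience. The "remembered" score below is simply the average of the worst moment and the final moment, which is one common simplification; it ignores the small weight on the beginning, and the minute-by-minute discomfort ratings are invented for illustration.

```python
def duration_weighted(discomfort_per_minute):
    """Total discomfort: what intuition says should matter."""
    return sum(discomfort_per_minute)

def peak_end(discomfort_per_minute):
    """Remembered discomfort, assumed here to be the mean of the peak and the last moment."""
    return (max(discomfort_per_minute) + discomfort_per_minute[-1]) / 2

# Hypothetical ratings on a 0-10 scale, one per minute.
short_trial = [7, 7, 7, 7, 7]                # 5 minutes, ends at its worst
long_trial = short_trial + [5, 4, 3, 2, 2]   # the same 5 minutes plus 5 milder ones

for name, trial in [("short", short_trial), ("long", long_trial)]:
    print(name, duration_weighted(trial), peak_end(trial))
```

Under these assumptions the longer experience contains strictly more total discomfort, yet scores as less unpleasant in memory because it ends gently.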

This probably means you should take fewer, shorter holidays, but it also depends on how you weight the importance of what Kahneman calls the “experiencing self” and the “remembering self”, which is quite a tricky philosophical problem.

Life is like an Iterated Prisoner’s Dilemma, except different

As a student I was very interested in the Prisoner’s Dilemma, and the iterated version, in which one must choose to Co-operate with (C) or Defect against (D) another player making the same choice, knowing that you will come out best (and they worst) if you D while they C, but you will both do terribly if you both choose D.

One critical element: you can’t communicate with the other player. In the iterated version, the choice of C and D is effectively the method by which you communicate.
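For anyone who hasn't met it, a small sketch of the iterated game may help. The payoff numbers are the conventional textbook values (an assumption, since only their ordering matters), and "tit for tat" is a well-known strategy I've picked for illustration because it shows moves being used as communication: it simply copies whatever the other player did last.

```python
# (my move, their move) -> (my points, their points); assumed textbook values.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(opponent_history):
    """Co-operate first, then repeat the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"

def always_defect(opponent_history):
    """Defect no matter what the opponent has done."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds; each player sees only the other's past moves."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        points_a, points_b = PAYOFFS[(a, b)]
        score_a += points_a
        score_b += points_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))      # (30, 30)
print(play(tit_for_tat, always_defect))    # (9, 14)
print(play(always_defect, always_defect))  # (10, 10)
```

The defector beats its immediate opponent, but two co-operators each end up with far more than anyone in a game involving defection, which is the tension the rest of this section is about.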

Meanwhile in real life, most of the time, the thing most likely to help you is another human, and the thing most likely to do you harm is also another human, which means interacting with other humans is a pretty crucial business. In particular it’s good to be able to figure out – and influence – who is likely to co-operate (C) with you and who is likely to try to take advantage of (D) you.

I studied maths and physics as a student, but struggled to understand human behaviour. By studying subjects where exam answers were simply right or wrong, and doing quite well at those, I (and I suspect many others in the same situation) thought that I must be quite clever, and the reason I can’t understand human behaviour is because other people are just acting irrationally.

Now, it is true that people act irrationally a lot of the time (see the last Things), but that also includes me, and a lot of the things I couldn’t understand eventually made more sense when I realised that life was like an iterated Prisoner’s Dilemma, except that rounds aren’t discrete or simultaneous and there are multiple and varying pay-off matrices in play all the time.

For example, I noticed people asked “How are you?” but didn’t actually want to know, which seemed irrational. This was laid bare for me in a dentist’s waiting room when one elderly person entered and recognised another, and the following exchange took place:

A: Oh, hello there! How are you?
B: I’m fine, how are you?
A: I’m fine thanks. So [short pause] how are you then?
B: Well, I’ve been having this awful pain in my side, so I went to the doctor last week …

Similarly, as a marketing grad I was sent out with a cameraman to stop people on the street and get their opinions on climate change for a vox pop montage. I would walk up to people and ask them right away, and nobody stopped to answer. The cameraman, who had done this before, told me I should ask them how they were first. This seemed ridiculous, as a person approaching you with a microphone and film camera obviously doesn’t care how you are, they just want to film you saying something. But I tried asking anyway, and suddenly just about everyone was then happy to give their opinion on climate change for the camera.

I realised the whole “How are you” bit is like a tiny move in an iterated Prisoner’s Dilemma in which you are really communicating “I will co-operate with you”, and the other person can demonstrate a reciprocal intention by asking you the same thing back. This then sets the scene for further and deeper co-operation.

Moves aren’t just made in speech either. I thought buttoned shirts were ridiculous in comparison with t-shirts: uncomfortable, more time-consuming to put on and remove, and harder to clean. Why would anyone choose to wear one? But it turns out clothing is a widely understood opening move in our co-operation dialogues. We learn that we can estimate by someone’s clothes how likely they are to co-operate with or benefit us in certain ways; uniforms do this in an overt way, but even a slight deviation from your company’s dress code sends a signal.

Cat and Girl covered this, of course.

More generally, these kinds of behaviours make a society cohesive – by doing what everyone else does, you tacitly signal that you are a good co-operator in your society. At the same time it can make society conservative, as anyone deviating from locally normal behaviour (even for rational reasons) might be read as less co-operative, and so they will encounter more friction.

Location Encoding

What3Words (W3W) assigns each 3m x 3m square on Earth a three word designation (e.g. Each.Useful.Shark). This makes it fantastic for real-world treasure hunts, so long as the participants can use the mobile app, and I’ve made a couple of events that leveraged it to (I thought) rather fun effect.
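It's worth a quick back-of-envelope check on why three words is enough. The sketch below assumes a rough figure for the Earth's total surface area and asks how large a word list would need to be for every ordered triple of words to cover every 3m x 3m square; it says nothing about how W3W actually chooses or assigns its words.

```python
import math

EARTH_SURFACE_M2 = 5.1e14   # approximate total surface area of Earth (assumed figure)
CELL_SIDE_M = 3

def cells_needed(surface_m2=EARTH_SURFACE_M2, side_m=CELL_SIDE_M):
    """Number of side_m x side_m squares needed to tile the given area."""
    return math.ceil(surface_m2 / (side_m * side_m))

def min_vocabulary(n_cells, words_per_address=3):
    """Smallest word list whose ordered tuples of words can label n_cells."""
    size = round(n_cells ** (1 / words_per_address))
    # Correct any floating-point error in the root using exact integer powers.
    while size ** words_per_address < n_cells:
        size += 1
    while size > 1 and (size - 1) ** words_per_address >= n_cells:
        size -= 1
    return size

cells = cells_needed()
print(f"squares to label: {cells:.3e}")
print(f"words needed for 3-word addresses: {min_vocabulary(cells)}")
```

Under these assumptions the answer lands in the tens of thousands of words, which is a big vocabulary but a plausible one.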

However, Richard brought to my attention that among people interested in the general problem of addressing, W3W is viewed very negatively. Why is that?

Reading up on the subject (this post was particularly useful), it seems like W3W lacks some attributes a truly general Location Encoding system should really have. But what really annoys people who understand this area well is that W3W tends to put out PR that claims to be strong in the areas it is weak. In brief:

  • W3W is a private company (probably hoping to be acquired by another one). Location/address is something that works best when it’s a standard, and having a private company own a standard leads to conflicts of interest. (See the Microsoft ‘Embrace, Extend, and Extinguish’ strategy for an example of how private companies fight public standards).
  • W3W is not a good solution for emergency situations (calling an ambulance to your location; calling a Fire Engine to a location you see on fire).
  • W3W is not error tolerant and has no hierarchy (e.g. one mis-spelt/mis-remembered character has very little chance of being corrected, in contrast with traditional addresses, where post accurately addressed apart from saying “Brighton” instead of “Hove” still successfully gets to Hove with the word Brighton angrily crossed out and corrected).

Still, I do think W3W has some value, and it would be unreasonable to discard it entirely because it can’t do everything – indeed, no address system can meet all the requirements we might ask of it.

Google Maps’ location-sharing functionality covers many options, and has the benefit of being already available in many people’s pockets, but I recently had a situation where both intuitive addressing and Google failed: meeting at the “Joe’s Café in Soho” does not specify a unique location, and the inaccuracy of GPS meant a shared Google location didn’t resolve the matter either. W3W is actually pretty excellent for this sort of spontaneous meeting. All things considered though, the best thing about it really does seem to be the opportunities for Treasure Hunts.

Video Games

I’ve played some games since the last Things, some of which I recommend, and some of which I don’t!

Baba Is You (Steam, Humble store, itch.io, Switch)

The most instantly-gettable trailer seems to be in this tweet:

For Things readers partial to self-referentiality and all things meta (and I know there are a bunch of you), this is certainly worth a look. As the above video shows, the game is played by pushing things around, including words that define the rules of the game.

In practice it’s even more mind-boggling than I expected, but not actually as much fun as I had hoped.

Celeste (Steam, itch.io, Switch, PS4, Xbox One, Pico-8 prototype)
If you want a platform game with puzzle elements and enjoy dying repeatedly while you slowly get better at doing difficult things, this is extremely the thing for you. The soundtrack is quite lovely too.

My save file, 44 hours, 10,000+ deaths, is a review in itself (implicit spoilers split-by-level version is here):

Lovers in a Dangerous Space Time (Steam, Switch, PS4, Xbox One)
Looks exactly like what it is: a rather nice local co-op shooter in which you and some friends control characters running around a ship manning the helm/gun/shield/panic-button and rescuing animals in space.

Thomas Was Alone (Direct for Mac or PC, iOS, Android, Steam)
A kind of “self-aware” puzzle-platformer that everyone was going on about a few years ago; I finally tried it and found it dull and not at all as funny as it seemed to think it was, with frustratingly vague platforming physics.

The Legend of Zelda: Breath of the Wild (Switch and Wii U only)
An open world epic about saving the world which I tried – and failed – to enjoy for about 12 hours before giving up. Almost everything about it felt like a chore and I couldn’t understand how it gained such universal acclaim.

After a weirdly long time adventuring in my underwear, I finally found someone who would sell me clothes.

To pick one example, a complaint I had heard from some was that weapons could only be used a certain number of times before they fell apart and had to be replaced. I thought this was just misguided resentment of a feature clearly designed to add strategy to battles, but then I spent the first hour of the game picking up and discarding 37 differently ineffectual sticks (you can only hold a few at a time) and fought about 5 monsters. It felt like I spent substantially more time managing my ineffectual stick inventory than having battles, and so the whole weapons feature then felt like busywork.

Loading screen tip: read loading screen tips. By definition, this is useful to nobody.

Perhaps the worst part is I still feel like I “should” give it more of a chance, or at least get more entertainment for my money, when rationally I know there must be more games out there like Celeste, which was an order of magnitude more fun for a fraction of the price.

Horizon Zero Dawn (PS4 only)
Snakes on a Plane is a great title for a film because it perfectly sells the premise. Horizon Zero Dawn is a terrible name in that regard, but the promotional art sells the premise perfectly:

Tribal humans hunting robot dinosaurs! Which immediately looks like something I want to try, and also very quickly raises the question of just how such a scenario could even come about. It turns out the game is exactly about those two things: hunting robot dinosaurs while figuring out how this happened!

It certainly stands on the shoulders of giants in terms of the use of ‘Open World’ game conventions, but it adds a few interesting ideas and does just about everything you would hope to extremely well.

*I made up “chi-notation”, because I couldn’t think of a clearer example, it seemed funny, and illustrates the point just as well.

- Transmission finally ends

Things 133: Overreacting, audio history of sampling, internet vs time, meta-meta-analysis

Comics – Overreacting

Jemma Salume has an excellent series of comics about overreacting to things (and also learning to cook, and dating). They’re compact and hyperbolic, which is how I like my comics, and also how I like my toy universe model geometries, hahaha.

Music – Raiding the 20th Century
This remains my favourite mix, and with the ten-year anniversary upon us I was surprised to realise I had never put it in Things.

In 2004, DJ Food (aka Strictly Kev) made a 40-minute mix for XFM chronicling the history of ‘cut-up’ (essentially sample-based) music which he called ‘Raiding the 20th Century’. Shortly afterwards he read Paul Morley’s book ‘Words and Music’ which did much the same thing and covered much of the same material. Paul Morley also coined the phrase ‘Raiding the 20th Century’ twenty years earlier. Taking note of this big flashing fate-arrow, they got together, recorded Paul reading key parts of the book, and created a new hour-long mix of the material.

The mp3 is available over on archive.org, the track listing is here, and you can go ahead and listen to it right here:

It’s about 20 minutes before the ‘history’ really starts, and while Morley’s commentary then explains and introduces many of the tracks and samples, many more are used without comment. Over the years, as I learn more about music history, more and more of them are making sense, which is very satisfying. As one of the samples used states: “every time you listen to this recording, something will happen.”

Links – Time and the Internet
As we build up an ever larger historical archive of material online, the date something was originally published becomes more important, and something we’ll need to become more aware of (assuming we avoid internet decay).

I like the approach of the BBC, which appears to maintain the CMS that articles originally appeared in (for example, this report from September 11th 2001, or the Mammal-of-the-month November 2002). That’s still not quite enough to avoid the confusion that may arise from incautious Googling for events that recur. Also, try to work out when this was written.

Anyway, if you would prefer a cogent discussion of the topic rather than a selection of semi-random BBC links, then I highly recommend Joanne McNeil’s piece on the subject here, in which she says things much more precisely than I have been, like this:

“Digital content appears with minimal visual language distinguishing yesterday from tomorrow and today. Now habits have emerged in which we communicate with the past and even mistake it for the present.”

(Also, see this Cat and Girl comic).

Video – Brett Domino
Looking through previous editions of Things, I was surprised to find I’d never featured Brett Domino, who does a range of silly-but-clever, bad-but-good things with music. I think the most impressive is his medley of the top 10 pop songs at the time he hit 10,000 Twitter followers, which culminates in a surprisingly effective montage finale:

Link – Scientific truth, researcher bias, and parapsychology
In a meta-analysis, the results of many similar experiments are analysed together in order to gain statistical power and shed more light on subtle phenomena. For example, if it’s a very small effect, some experiments won’t yield any results, perhaps causing us to question the experiments that do find an effect; by considering all these experiments together, we can better assess if we’re seeing Type I or Type II errors. Also, if you suspect the result may only come about due to sloppy methodology, you can see if there is a correlation between how ‘rigorous’ a study is and the size of any effect that it finds – if more rigorous studies come up with smaller effects, that’s quite suggestive.
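To make the "gain statistical power" part concrete, here is a minimal sketch of one standard way to combine studies: a fixed-effect, inverse-variance weighted average, in which more precise studies count for more. The effect sizes and standard errors are made up for illustration, and real meta-analyses often use more elaborate models than this.

```python
import math

def pooled_effect(effects, std_errors):
    """Fixed-effect meta-analysis: weight each study by 1 / (standard error squared)."""
    weights = [1 / se ** 2 for se in std_errors]
    estimate = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return estimate, pooled_se

# Hypothetical studies of the same small effect, not real data.
effects = [0.30, 0.05, 0.12, -0.02]
std_errors = [0.20, 0.10, 0.15, 0.08]

estimate, pooled_se = pooled_effect(effects, std_errors)
print(f"pooled effect: {estimate:.3f} +/- {pooled_se:.3f}")
```

The pooled standard error is always smaller than that of any single study in the set, which is where the extra power to detect subtle effects comes from.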

Years ago I read about a meta-analysis of research into psychic abilities, and the results were not clear-cut one way or the other, despite taking a comprehensive overview of the relevant studies. I thought that was very interesting, because it suggested that either psychic abilities were real, or the scientific method wasn’t as infallible as I had thought (or both).

Many more studies have been performed since, and this problem does not seem to go away. A strong clue seems to be the experimenter’s bias effect: a researcher who believes that an experiment will yield a certain outcome is more likely to end up getting that outcome, even if they are not intentionally manipulating the experiment to that end.

Of course, experimenter’s bias is quite a tricky and small effect to prove, so what you need to do is a meta-analysis across the various studies into it. But when different people conduct this meta-analysis, they reach different conclusions: some find the experimenter’s bias effect exists, and some find it doesn’t!

If you’ve been following closely to this point, you can guess the logical next step: we need a meta-meta-analysis of the experimenter’s bias meta-analyses, to see if meta-experimenters that believe the experimenter’s bias effect exists were more likely to find exactly that result in their meta-analysis! Brilliantly, and also alarmingly, this meta-meta-analysis was conducted and concluded that, yes, that’s exactly what happens: there is indeed a meta-experimenter’s bias effect. So the question now is… does the experimenter’s bias effect actually exist?

I found all this out from a brilliant essay by Scott Alexander, which includes all the juicy references and finishes with an amusingly modified Star Wars quote, so is pretty much perfect.

Puzzle – Sequel Naming
For some media, major new updates are numbered: movies (Iron Man 3), TV series (Game of Thrones Season 4) and video games (Call of Duty 4) are obvious, but it’s also dominant in operating systems (Windows 8), consoles/phones (PlayStation 4, Samsung Galaxy S4) and even classical music (Bach’s Cantata No. 140).

Other things don’t seem to work that way, notably books (A Clash of Kings, rather than A Game of Thrones 2) and albums (Björk – Post, rather than Debut 2), but also theatrical productions (admittedly much rarer, but it’s Love Never Dies, rather than Phantom of the Opera 2).

(Of course, sometimes people mix their strategies with hilarious results: Call of Duty 4: Modern Warfare 2, BT Infinity 2, Xbox “One”)

The contrast is most stark in TV series versus books. So the question is this: why do we have Game of Thrones Season 2 on TV, but A Clash of Kings in book form?

Tim Mannveille tweets as @metatim, and previously worried about old things disappearing from the internet