How 'Science Vs' Accidentally Invented A Gender Dysphoria Desistance Statistic

It is not easy to report on this subject

Mar 29, 2019

A Critique of the ‘Science Vs’ Episode on Being Transgender, Part 2

This will be picking up from Part 1, which ran late Wednesday night. What follows won’t make sense unless you read that first. Here’s how things are organized in the two posts:

Part 1, Wednesday night’s post:

I. A Quick Word on Science Vs and Something it Does Quite Well

II. The First Error: Yes, the DSM-5 Views Gender Dysphoria as a Mental Disorder

***

Part 2, this post:

III. Background Info on the Desistance Debate, and What Desistance-Deniers Often Get Wrong

IV. The Second Error: The Phantom Desistance Statistic

V. Wrapping Up: It’s Hard to Report on Gender Dysphoria

III. Background Info on the Desistance Debate, and What Desistance-Deniers Often Get Wrong

There’s no area of the gender dysphoria discussion where some journalists and activists more frequently disseminate misleading information than in the conversation about desistance. Now, it doesn’t help matters that ‘desistance’ is defined in different ways at different times. In some contexts, ‘desistance’ means “When someone who used to identify as trans no longer does,” particularly in the cases of young people — here the term is pretty much interchangeable with ‘detransitioner.’ Other times, it means “When someone who used to experience gender dysphoria no longer does.” Because those two categories, being trans and having gender dysphoria, overlap a lot but aren’t, in current usage, equivalent, this means that sometimes people using the same word are talking about different things.

I think the second definition, centered on gender dysphoria, is far more useful when discussing young people and puberty blockers and hormones. You don’t go on puberty blockers because you are trans — you go on them because you have severe and persistent gender dysphoria that puberty could exacerbate. The same goes for hormones and surgery. There appears to be a sizable and growing number of kids (and adults) who identify as trans but who don’t have intense dysphoria, and their experiences — whether or not they maintain their trans identity or shed it as they continue to grow and explore — aren’t quite relevant to the question of medical interventions.

So when I use ‘desist’ it’s in the sense of someone’s gender dysphoria abating over time, regardless of how they identified during that period. That’s what should matter to young people considering physical interventions, and to parents who want to know the likelihood that a kid who is quite dysphoric at age 5 will feel that way at age 10, around when it might be time to start looking into puberty blockers.

Now, the best evidence we have suggests a significant number of kids with GD desist over time, and that the younger a kid is, the more likely it is their GD will desist. But complicating matters is another finding, backed up by a bit less research but still potentially important, that there appears to be a correlation between how severe a given case of childhood GD is — captured by, among other measures, whether a kid says he wants to be the other sex or is the other sex — and how likely it is that it will persist in the long run.

It’s vitally important, whenever discussing this subject, to caution parents against confidently predicting their kids’ long-term trajectories on the basis of zoomed-out desistance statistics. The fact is that for a given kid, there’s no way to know for sure. There are kids with severe GD who desist. There are kids with little or no GD who develop it after puberty, big-time, who transition, and who live happily ever after as trans adults. If you are confident your 5-year-old with GD will desist, you’re doing it wrong. If you are confident your 5-year-old with GD won’t desist, and that blockers and hormones are definitely in her future, you are also doing it wrong. Parents should allow themselves and their kids to inhabit an uncertain, exploratory space for a while as young ones figure out who they are — that’s a point that has been hammered home to me by some of the leading authorities on this subject.

It goes without saying that when conservative-minded parents pressure a child into “acting like” a boy or a girl when they aren’t doing so, or react unsympathetically to indications that their child is experiencing distress as a result of gender dysphoria, this can do real harm. Simply convincing parents to be more accepting of gender nonconforming behavior would, on its own, alleviate a lot of needless childhood suffering.

This sort of parenting error goes both ways, though, as I noted in my Atlantic piece:

[P]rogressive-minded parents can sometimes be a problem for their kids as well. Several of the clinicians I spoke with, including Nate Sharon, Laura Edwards-Leeper, and Scott Leibowitz, recounted new patients arriving at their clinics, their parents having already developed detailed plans for them to transition. “I’ve actually had patients with parents pressuring me to recommend their kids start hormones,” Sharon said.
In these cases, the child might be capably navigating a liminal period of gender exploration; it’s the parents who are having trouble not knowing whether their kid is a boy or a girl. As Sharon put it: “Everything’s going great, but Mom’s like, ‘My transgender kid is going to commit suicide as soon as he starts puberty, and we need to start the hormones now.’ And I’m like, ‘Actually, your kid’s just fine right now. And we want to leave it open to him, for him to decide that.’ Don’t put that in stone for this kid, you know?”

So this is all very complicated, and can lead to periods of uncertainty that can be difficult for parents to cope with. But all that said, there’s a lot of evidence that desistance is, at the very least, common enough that parents should be aware it is a fairly likely possibility. As the sex researcher James Cantor has pointed out (if you click that link, make sure to read the caveats that are coming up), every study ever published on transgender and gender nonconforming (TGNC) kids has produced evidence for a high desistance rate. That’s how the meme of an 80% desistance rate, which is probably an overestimate, caught on.

But the desistance literature has been critiqued harshly for years by some activists and journalists, mostly on the grounds that the studies purporting to show high desistance rates were really tracking large numbers of kids who weren’t gender dysphoric to begin with — rather, those kids were merely gender nonconforming, as in the case of a natal boy who likes to wear dresses from time to time but who doesn’t really feel like or want to be a girl in any deep and persistent sense. They were never going to grow up to be trans, say these skeptics, because they were never really dysphoric in the first place. Therefore, counting them as ‘desisters,’ which the literature did, artificially inflated the apparent desistance rate.

This is probably true when it comes to older research conducted during periods when gender dysphoria and nonconformity were less well understood than they are in the 21st century. So I do think Cantor’s (otherwise helpful) rundown of all the studies could use just a bit more detail:

I also think someone unfamiliar with the specifics of the literature could misinterpret the denominators on this page — these studies tend to include kids who were both threshold and subthreshold for gender identity disorder (the condition being used at the time, since it was the era of the DSM-III and -IV). That’s an important distinction, and there’s a risk people will misread the results of the bottom study, for example, as “out of 127 kids with severe gender dysphoria, just 47 ended up identifying as trans in the long run,” which isn’t quite what the researchers found. Rather, “127 kids” refers to the total sample, some of whom had severe gender dysphoria and others of whom did not.

Anyway, I would point anyone curious about desistance to those four most recent studies, all of which came out of the Center of Expertise on Gender Dysphoria at the VU University Medical Center Amsterdam, also known as “the Dutch clinic,” or the Gender Identity Clinic at The Centre for Addiction and Mental Health in Toronto. These are bigger, more recent studies, and on its face it seems less likely that the clinicians who published them would have confused mere gender nonconformity with gender dysphoria. (As my reporting showed, the Toronto clinic was shut down and its staff fired in late 2015 partially as a result of false and unverified rumors spread about its practices and its head, Ken Zucker, by certain activists and journalists, and by CAMH itself in its justification for the decision. Zucker subsequently sued CAMH and the hospital eventually acknowledged some procedural wrongdoing, giving him a settlement worth hundreds of thousands of dollars.)

But the skeptics have a response: They might not be all that old, they say, but these studies, too, are flawed, because the DSM-IV criteria for childhood gender dysphoria that they used were too loose and lumped in many merely gender nonconforming kids. Experts seem to generally accept that this is true to a small degree. But to claim it would affect the results all that much is, in my view, a leap.

Here are the DSM-IV criteria:

Skeptics of these criteria harp on criterion A a great deal, particularly the fact that A(1), a “repeatedly stated desire to be, or insistence that he or she is, the other sex,” isn’t required for a kid to qualify as having gender identity disorder. So a natal boy can be diagnosed with GID without ever stating he is or wants to be a girl. This is true. But look at the other criteria he would have to meet, particularly B and D, which focus heavily on distress and impairment. In my view, as I have argued elsewhere, it would be hard for most merely gender nonconforming kids who didn’t have dysphoria to meet these criteria, even if a few did.

Try to imagine what this kid would look like. I’ll use ‘he’ pronouns since he doesn’t identify as a girl. But, as per the criteria above, he (A) prefers girls’ clothes and playing as female characters in fantasy settings, and has an intense desire to be included with girls when they’re playing princess games and so forth (we’re leaving out A(1)). In addition, he (B) has a strong aversion to doing anything that would mark him, behaviorally, as being a boy, he (C) doesn’t have an intersex condition that might provide some other explanation for what’s going on, and (D) this is all causing him distress and impairment in important areas of functioning. I don’t think it would be crazy to suggest that a kid like this, one who appears to experience serious distress at being and being seen as a male even if he doesn’t explicitly verbalize the desire to not be one, or the belief that he isn’t one, has gender dysphoria. I’d argue what’s going on is far more intense than what’s going on with a boy who just likes to dress up as a princess occasionally. (I feel like this could open up yet another whole can of worms in a blog post that’s already rather wormy, so I’ll state it briefly: I also don’t understand why identifying as the other sex should be a prerequisite for being diagnosed with childhood gender dysphoria anyway, as long as the other criteria are reasonably tight. As I explained in Part 1, there are plenty of people who have gender dysphoria (they find being seen as male or female, or having a male or female body, distressing) but who don’t identify as the other sex. Surely kids could experience something similar?)

Either way, critics of the DSM-IV tend to leave out the fact that the manual explicitly instructs clinicians not to wrongly diagnose merely gender nonconforming kids as having gender identity disorder. From the Differential Diagnosis section of that part of the DSM-IV:

Gender Identity Disorder can be distinguished from simple nonconformity to stereotypical sex role behavior by the extent and pervasiveness of the cross-gender wishes, interests, and activities. This disorder is not meant to describe a child's nonconformity to stereotypic sex-role behavior as, for example, in “tomboyishness” in girls or “sissyish” behavior in boys. Rather, it represents a profound disturbance of the individual's sense of identity with regard to maleness or femaleness. Behavior in children that merely does not fit the cultural stereotype of masculinity or femininity should not be given the diagnosis unless the full syndrome is present, including marked distress or impairment. [emphasis in the original]

Could the authors have made it any clearer that they did not intend for merely gender nonconforming kids to be lumped in under the GID diagnosis? That said, some experts do seem to agree that these criteria were slightly too loose, and that’s why they were tightened in the DSM-5. So desistance estimates derived from the DSM-IV criteria for GID probably do overstate the rate of this phenomenon as a result of a bit of diagnostic fuzziness.

But some desistance-skeptics haven’t just argued that the estimates generated by these studies should be revised downward a little; rather, they’ve repeatedly claimed that the studies are fundamentally untrustworthy because the clinicians in Amsterdam and Toronto were virtually incapable of distinguishing gender dysphoria from mere gender nonconformity. Zack Ford of ThinkProgress, for example, that publication’s leading authority on trans issues, once told me via Twitter that “The research that informs desistance is bunk. IF desistance exists, we have nothing currently available to substantiate it as a claim.” To say there is “no evidence” to support Ford’s claim would be too generous, given the existence of evidence pointing in the opposite direction — unless, of course, one simply throws out those four most recent desistance studies. To take a representative claim from his published work on this subject, Ford writes in one article that “The children in those studies who ‘desisted’ in their identities were the kids who weren’t actually transgender to begin with.” That’s a really strong and certain claim. He’s not saying ‘some of the children’ who desisted weren’t trans (again, what really matters here is whether they were dysphoric) — he’s saying ‘the children,’ full stop, weren’t trans. You’d think he’d provide some hard evidence that such rampant misdiagnosis was going on at two of the most famous gender clinics in the world, but he doesn’t.

Plenty of other desistance skeptics have made similarly strong claims, and have rarely backed them with much evidence. Ford’s article links to what is probably one of the most widely circulated such articles, “The End of the Desistance Myth,” written by the leading trans activist Brynn Tannehill. She writes of a 2011 study published by clinicians at the Amsterdam clinic that it “did not actually differentiate between children with consistent, persistent and insistent gender dysphoria, kids who socially transitioned, and kids who just acted more masculine or feminine than their birth sex and culture allowed for. In other words, it treated gender non-conformance the same as gender dysphoria.” This is a confusing claim given that the linked-to study is a qualitative followup that doesn’t even present a desistance-rate estimate of its own, and that the text of it almost entirely contradicts her critiques. But that general claim about the recent desistance research — that it recklessly lumped in all sorts of different kids — is par for the course in the desistance-skeptical work of Tannehill and others.

In another commonly cited article, for example, Kelley Winters writes that “Gender nonconforming children with no actual evidence of gender dysphoria were very easily misdiagnosed with ‘Gender Identity Disorder of Children’ because of flawed diagnostic criteria the [sic] DSM-IV.” As evidence for this claim, she cites a blog post she herself authored in 2008 and a presentation she posted in 2014, but these sources mostly just reiterate the standard critiques of the DSM-IV, and make some fair points about estimates like 80% probably being too high. They offer little evidence to support the very strong claim that merely gender nonconforming kids were “very easily” misdiagnosed with GID by clinicians following the DSM-IV criteria. And Winters, like other strong critics of the desistance research, simply ignores the fact that the DSM-IV explicitly instructed clinicians not to make the mistake she claims it caused them to make chronically.

These attempts to undermine the desistance literature are largely motivated, I believe, by the reasonable concern that the concept of desistance has been and will be used as justification to prevent young people from transitioning who should transition. That, again, would be a misreading of the literature given the folly of attempting to apply zoomed-out statistics to an individual young person in the throes of various developmental changes. But the question of whether desistance is a ‘myth’ or close to it (no) is different from the question of whether the concept could be used, if badly misinterpreted, to harm people (yes). The former belief has caught on, though, and appears to be held by a chunk of the trans community, and by many of its most dedicated advocates. It has arguably become something of a perceived proxy for one’s dedication to trans rights: People who are on the right side of this issue acknowledge that the desistance research is fatally flawed, because to believe otherwise is to open up trans people to harm.

But this group of young women who desisted, not to mention countless others like those I interviewed in my Atlantic piece, would likely have strongly negative feelings about the idea that desistance is too rare or dangerous to discuss. In fact, whenever I have looked for desisters in my own reporting, they have been easy to find — pretty much what the literature would predict. They’re just less likely to pop up in the public discourse because they don’t really have anything urgent to advocate for in the way trans people are forced to advocate for their basic dignity, nondiscrimination, and medical services, and because many (perhaps most) mainstream journalists actively ignore desisters and treat them as too politically inconvenient to report on.

At the risk of beating a dead horse, one more time: The commentators whose work I am critiquing here, and others, are correct in pointing out that the DSM-5 defines gender dysphoria in a tighter way than the DSM-IV defined what used to be called gender identity disorder. To the extent they are arguing against the unquestioned acceptance of that 80% statistic, I agree with them. But their stronger claims that the Toronto and Amsterdam clinics were incapable of differentiating, or unwilling to differentiate, between merely gender nonconforming and gender dysphoric kids are pretty easily debunked by the available evidence when you read those actual studies.

Now, those studies did measure gender-nonconforming behavior, but there’s a reason for that: All else being equal, kids who engage in so-called ‘cross-sex’ behavior, particularly play, are more likely to grow up to be gay or trans than kids who are more ‘traditional’ in that regard. It’s diagnostically useful to measure how a kid acts and plays, in other words, as long as the clinician doing the measuring understands that it’s only part of the story. It can also be useful when the goal is to figure out what parental concerns precipitated a child’s trip to a gender clinic in the first place. If evaluating little Tina reveals that she loves rough and tumble play and to dress up like a fireman, but doesn’t have a trace of actual gender dysphoria, and interviews with her conservative parents reveal deep consternation on their part at her tomboyish play and dressup habits, the way forward is clear: Focus less on Tina, and more on helping her parents realize there’s nothing to be ‘treated.’ Tina just likes to wrestle and dress up like a fireman. Maybe she’ll grow up to be gay — and maybe not.

In addition to measuring gender nonconforming behavior, all four of these studies measured gender dysphoria quite specifically. In fact, in all four the clinicians didn’t solely rely on the DSM-IV criteria, but also used psychometric instruments whose purpose is to measure the severity of gender dysphoria. Let’s briefly go through them study by study, to hopefully put to bed this rumor that they suffered from backbreaking flaws that render them disposable.

A subset of these four studies’ total sample, it should be said, were evaluated on the basis of the DSM-III criteria, but that manual’s GID listing, too, contained language like this:

The essential features are a persistent feeling of discomfort and inappropriateness in a child about his or her anatomic sex and the desire to be, or insistence that he or she is, of the other sex. [...] This is not merely the rejection of stereotypical sex role behavior as, for example, in ‘tomboyishness’ in girls or ‘sissyish’ behavior in boys, but rather a profound disturbance of the normal sense of maleness or femaleness.

And for both boys and girls, a required criterion in the DSM-III’s version of childhood GID was a “Strongly and persistently stated desire to be a [boy/girl], or insistence that [s/he] is a [boy/girl]” — the criterion whose absence in the DSM-IV was highlighted as a reason to doubt that manual’s definition of GID.

Here’s a table from Wallien and Cohen-Kettenis (2008):

It shows that at intake, the kids in this study were evaluated with both the Gender Identity Interview for Children and the Gender Identity Questionnaire for Children.

Here’s an excerpt from Drummond et al, 2008:

It shows that the researchers didn’t just use the DSM-IV criteria, but also a composite score that combined measures of cross-sex behavior with both the aforementioned GIQC and Zucker et al’s Gender Identity Interview.

That latter one looks like this:

Despite being more than 25 years old, this instrument contains exactly the sorts of items the critics trying to discredit these studies are concerned about! It is clearly geared at allowing the clinician administering it to better understand where a child’s gender feelings are coming from, and to differentiate between a child (say) feeling like or wanting to be a girl as a result of gender dysphoria, and wanting to be a girl for more superficial reasons. Sometimes kids will express this sort of wish because girls are treated nicer by teachers or boys get to do rough-and-tumble play, or whatever. The wonderful, clearly non-dysphoric daughter of one of my best friends, for example, expressed a desire to have her hair cut like a boy’s so she could run faster, since boys run faster.

Moving on, here’s a table from Devita Singh’s 2012 dissertation out of the Toronto clinic:

Singh’s is the only non-traditionally-published study of the bunch. I still find it valuable and would argue for its inclusion as part of the desistance literature given the relative thinness of that literature and given that, as an accepted and published dissertation, it was exposed to a form of peer review. But toss it out if you want — it won’t change the general takeaway much. Either way, this table clearly shows, once more, that the clinicians used some measures specifically geared at measuring gender dysphoria, not just gender nonconforming behavior.

Finally, here’s an excerpt from the “Measures: Childhood” section of Steensma et al (2013), the single largest peer-reviewed study published on this subject:

Again, it’s quite clear the authors measured exactly the stuff that the critics are claiming they didn’t measure.

Okay, deep breath: Got all that? The four most recent desistance studies didn’t just rely on the (pretty solid, in my view, anyway) DSM-IV criteria — they also administered psychometric instruments geared at measuring gender dysphoria itself. There is effectively no textual evidence for the now-widely-accepted-by-some claim that these studies haphazardly lumped together genuinely gender dysphoric and merely gender nonconforming kids. But that hasn’t stopped many outlets, including Science Vs, from spreading it.

IV. The Second Error: The Phantom Desistance Statistic

Back, finally, to the episode. Here’s how the show explains the desistance controversy to its listeners, drawing on an interview with Laura Edwards-Leeper, a very talented clinician whom I featured in my Atlantic article. I’m inserting numbers so I can take the points one by one.

Laura told us that some studies seem to show that lots of trans kids change their mind. And here’s what those studies are doing: They take a group of kids from a gender clinic and follow them ... and they find that most aren’t trans when they grow up. [56] [57] [58] [59] But Laura says we have to be careful with what this means… because (1) a lot of these studies are just scooping up all the kids at the clinic [60] [61] [62] …not just those who say they’re trans. And kids end up at gender clinics for all sorts of reasons …
LEL: They can be a boy and wear dresses and like sparkly things, and that doesn't mean they're a girl,
WZ: Or they've just worked out that dresses are really fun and many men are missing out? Exactly
(2) LEL: I think what sometimes has happened in the earlier research as they've lumped all of these gender diverse kids into the same category.
Basically Laura says these studies are tracking all kinds of kids - gay kids… tom boys… boys who like nail polish - and then saying: hey! Most of them aren’t trans. True, but not particularly useful. Laura says you really need to focus on the kids who are having issues with their gender identity.. So for example… the kids who strongly agree with statements like … “I wish I had been born a boy” or “Every time someone treats me like a girl I feel hurt.” [63],[64] and Laura says has met a LOT of kids like this.
Sometimes we have these little kids who are so like just disconnected from their body, and they just really feel so strongly that this is not the body I'm so supposed to have-01
(3) The only study we’ve found that zoomed in on kids like this… found that out of 45 of them… 44 grew up to be transgender.[65],[66],[67]. So only one kid didn’t. Now this is just one study, but it lines up with Laura’s experience too.
WZ: In your practice have you seen any kids that are super insistent and then do ultimately stick with the gender they were born with? switch?
LEL Yes, I have, Yeah I have not many, not many of them. But if we're talking about the kids who are saying things like, “God made a mistake” or “mommy why did you make me this way” You know or “I want to cut off my penis.”
WZ: Oh my God. And you hear kids saying that stuff.
LEL Absolutely. I mean those are the kids that tend to be more likely to continue.
From the best evidence we could find, the majority of kids who insist that they’re trans… won’t grow out of it. (4) But Laura says this whole argument over exactly how many kids will change their mind is kind of missing the point. Because ultimately, no one can predict which way any kid will go… so you have to treat each little tacker as they come. [emphasis in the original]

(1) The claim that “a lot of these studies are just scooping up all the kids at the clinic [60] [61] [62] …not just those who say they’re trans” is misleading with regard to the most recent studies. Yes, there is a thing called a “consecutive referral study” in which you do just that — you scoop up all the patients who were referred to a given clinic over a given span. But the clear implication here is that these studies were being sloppy, that they were haphazardly confusing gender dysphoria and mere gender nonconformity. As I just explained, that isn’t true for any of the most recent studies, all of which came out of two gender clinics that carefully assessed their patients at intake.

(2) “I think what sometimes has happened in the earlier research as they've lumped all of these gender diverse kids into the same category,” Edwards-Leeper told Zukerman. She uses “earlier research” here for a reason — I don’t think she’s referring to the most recent studies. That view is supported by her claim about a boy wearing sparkling clothes: If you read about how the Amsterdam clinic assessed kids at intake, it’s basically impossible to conclude that Thomas Steensma and his colleagues would have misdiagnosed a boy who simply liked wearing sparkles as being gender dysphoric. (Edwards-Leeper visited the Amsterdam clinic back in the Aughts — that’s how she helped bring the puberty-blocking protocol pioneered there to the States for the first time — so she’s familiar with how Steensma and his colleagues do business.) But the whole segment blurs this distinction between earlier studies and the more recent ones. It effectively — and unfairly, in my view — presents all the studies finding desistance to be pretty common as suspect, similar to how many activists and journalists have in the past.

(3) “The only study we’ve found that zoomed in on kids like this… found that out of 45 of them… 44 grew up to be transgender.[65],[66],[67]. So only one kid didn’t.”

This is the big error — the accidentally made-up statistic. And I think what happened is that Science Vs’s producers fell for some confusing language in the 2013 Steensma et al study. In short, the authors use ‘persister’ and ‘desister’ to refer to kids who kept coming to the clinic, and those who stopped coming to the clinic, respectively — regardless of whether their gender dysphoria, well, persisted or desisted (unsurprisingly, the kids who kept coming to the clinic were the ones who kept seeking treatment, because their dysphoria didn’t desist). It’s unfortunate phrasing that has led to a lot of confusion. In my original reporting on this study, for example, I got fooled by it, falsely claiming the study offered weaker evidence for desistance than it really did — details here if you want to get even deeper into the weeds. In my defense, a lot of people made this mistake. Though in my non-defense, all it takes is a very careful read of the paper to understand what’s going on.

Here’s the first part of the footnote Science Vs uses to justify the claim that 44 out of 45 deeply dysphoric kids persisted. It consists mostly of a direct chunk of text from the study, with a parenthetical comment from one of the show’s producers bolded by me:

[65]Gender Identity and Body Image. Adolescents’ reports of GD and body image were compared across persisters and desisters (Table 4), and showed that persisters reported more GD than desisters in the mean total scores of both the GIIAA and the UGDS. Clinically, for the GIIAA, scores of less than 3 indicate GD;16 87.2% of the persisters met the criterion compared to 0% of the desisters. For the UGDS, scores of more than 40.0 indicate GD (Steensma, Kreukels, Jurgensen, Thyen, de Vries and Cohen-Kettenis, unpublished material, 2013); 97.9% of the persisters met the criterion compared to 2.2% of the desisters (1 bisexual, natal girl) (if 1 = 2.2%, then the total was 45- MH)

As is clear from that parenthetical, the producers extrapolated the idea of a 45-kid sample from the fact that the one kid who met the criteria for GD accounted for 2.2% of “the desisters.” The problem is that the phrase “97.9% of the persisters met the criterion compared to 2.2% of the desisters” is employing the terms ‘persisters’ and ‘desisters’ in that confusing sense I explained above — this translates to “97.9% of the kids who kept coming to the clinic met the criterion for gender dysphoria compared to 2.2% of the kids who stopped coming to the clinic.” Moreover, these measures were conducted at followup, not at assessment.
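
For readers who want to check the back-of-the-envelope math in that parenthetical, here is a minimal, illustrative Python sketch. It is not taken from the paper or the show: the function name `compatible_group_sizes` is hypothetical, and it assumes the reported percentage was rounded to one decimal place. It simply asks which group sizes are consistent with one kid making up a reported 2.2% of a group.

```python
def compatible_group_sizes(count, reported_pct, decimals=1, max_size=500):
    """Return every group size n for which count/n, expressed as a
    percentage and rounded to `decimals` places, equals reported_pct.

    Assumption (not from the paper): the published figure was produced
    by ordinary rounding to one decimal place.
    """
    return [
        n
        for n in range(count, max_size + 1)
        if round(100 * count / n, decimals) == reported_pct
    ]


# One kid in the group met the UGDS criterion, reported as 2.2%.
sizes = compatible_group_sizes(count=1, reported_pct=2.2)
print(sizes)  # [45, 46]
```

Under that rounding assumption, both 45 and 46 fit, so the percentage alone does not pin down a single group size. More to the point, whichever number is right, it describes the group labeled ‘desisters’ in the quoted passage, not a group of kids selected for intense dysphoria at intake.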

So here’s what the paper really says: Just 2.2% of the kids who stopped coming to the clinic had clinically significant gender dysphoria at followup.

And here’s how Science Vs interpreted it and presented it to the public: Just 2.2% of the kids whose gender dysphoria desisted had gender dysphoria at their initial intake.

That’s my interpretation of the error, at least, and for what it’s worth the paper’s lead author, Steensma, agreed with me: “You are absolutely right in your interpretation,” he said in an email. “Table 4 in that paper refers to the intensity of experienced gender dysphoria at the time of follow-up.” I also asked him if the producers of the show had tried to contact him when they were first putting it together, and he replied that he was pretty sure that they had, but that he had been too busy to respond. (As of yesterday, there had been some email back-and-forth between Steensma, Zukerman, and a producer at Science Vs — I got the sense the latter two were pretty aggressively trying to get this all fixed.)

From a science journalism perspective, this is a pretty serious error: The producers of Science Vs accidentally fabricated an entire study finding and disseminated it to the public. They should recut the episode to take this claim out (no one reads corrections). It’s extremely misleading, in that it runs quite counter to every published study on this subject — studies which listeners are more or less told they can safely ignore due to their supposed flaws.

(4) “But Laura says this whole argument over exactly how many kids will change their mind is kind of missing the point. Because ultimately, no one can predict which way any kid will go… so you have to treat each little tacker as they come.”

I’m glad Science Vs included this. It partially salvages a section that introduces a great deal of confusion and misinformation about desistance, and it’s a point Edwards-Leeper emphasized repeatedly to me during our many hours discussing her work. Every kid is an individual. If you’re a parent or a clinician caring for a kid with gender dysphoria, the desistance stats should be in the back of your mind, somewhere (this is something that happens, and not infrequently), but they should never push to the forefront, unduly influencing what decisions you make on that child’s behalf. Or, as another clinician put it to me: All that really matters is that we know the desistance rate isn’t close to 100%, and it isn’t close to 0%.

***

On top of all the aforementioned misunderstanding about what the Steensma study does and doesn’t show, some people have claimed that it should be tossed out of the desistance conversation on the basis of a KQED interview in which Steensma told Jon Brooks that, as Brooks put it, “citing these findings as a measure of desistance is wrongheaded, because the study was never designed with that goal in mind.” Rather, it was designed to look at indicators of persistence and desistance — and found, among other things, that the more intense GD is at a young age, the more likely it is to persist past puberty.

So should this paper simply not be cited at all as evidence when the desistance controversy comes up? Should we ignore the fact that it seems to provide solid evidence that desistance is fairly common? Not quite; I’ve corresponded a fair bit with Steensma and believe this, too, is a misunderstanding. First, one reason Steensma thinks an overall desistance estimate shouldn’t be extrapolated from this study is that, because of its age range, it likely underestimates the desistance rate at his clinic. As he explained to me in an email last year, since the study examined a relatively older cohort of youthful patients at the clinic, younger kids — who are more likely to desist — were underrepresented, likely skewing the desistance rate downward.

Second, Steensma also believes, more broadly, that the desistance rate at one clinic is just that: the desistance rate at one clinic. Like other knowledgeable researchers, he thinks all sorts of factors, from a given clinic’s diagnostic and treatment protocols to the broader culture in which it is enmeshed (surely the cultural mores surrounding gender nonconformity in and around Amsterdam differ from those in various parts of the United States, for example), affect the likelihood of gender dysphoria dissipating over time. So what Steensma is wary of is people pointing to the desistance estimate produced by one clinic at one particular moment in time and saying “That’s the desistance rate.” He is not wary of people using his study as evidence for the claim that desistance is pretty common.

Since I was emailing Steensma about this stuff anyway, I figured I’d try to clear up as much of this confusion as possible in one fell swoop. I asked him yesterday if he agrees with these four statements, drawn from our correspondence since last year and from my reporting on this subject:

1. The 2013 Steensma study shouldn't be used as an estimate of the overall rate of desistance because it wasn't designed for that purpose.
2. Steensma believes that the 2013 study probably underestimates the desistance rate [in that clinic at that time] because of the age restriction.
3. Steensma also believes that this study, combined with his clinical observations and with the other studies of desistance, suggests that desistance among kids with genuine gender dysphoria is pretty common, even if we don't have a precise estimate for it.
4. Steensma believes that there is a pretty urgent need for other clinics, in other countries, to do this sort of research, given that almost all the recent studies come out of Toronto or Amsterdam.

Steensma replied that yes, he agrees with those four statements.

V. Wrapping Up: It’s Hard to Report on Gender Dysphoria

It’s not good that Science Vs, looking out over four fairly recent studies that all offer solid evidence for the proposition that gender dysphoria desistance is at least pretty common, decided to disseminate the view that we can basically ignore these studies because they were so diagnostically muddled, despite the fact that they weren’t. It’s also not good that Science Vs, in examining the changes from the DSM-IV to the DSM-5, told its listeners that “being trans” used to be a mental health condition but now isn’t anymore, given that that isn’t true, either.

I don’t think it’s any mystery what happened here: If you’re a journalist reporting on transgender issues for the first time, you will likely come into contact with fairly politicized sources who will yank you in the wrong direction on key questions about the DSM, what we know about gender dysphoria, and so on. I know that the hip thing to say here is something like, “Well, everything’s politicized, so what does that even mean?,” but I disagree with that. To selectively portray a bunch of studies that measured gender dysphoria as having not really measured gender dysphoria, because you don’t want people to take those studies seriously, because you don’t want people to talk about gender dysphoria desistance — that’s politicization. To spread an inspiring storyline about the APA coming to its senses and depathologizing “being trans” — a storyline that doesn’t map onto anything that happened in reality, but which helps promote the ostensibly important idea that being trans is similar, in certain respects, to being gay — that’s politicization.

Oftentimes, I see these sorts of critiques caricatured as, “Oh, so you’re saying the world is divided neatly between perfect scientists and evil, scheming activists?” I hope I’m not giving that impression. For one thing, plenty of scientists have been deadly wrong about key issues, including in their treatment of trans and gender nonconforming people. There’s a long, ugly history there, and that’s part of the reason some trans people and activists instinctively distrust certain findings, especially findings that can be weaponized against them. There’s nothing irrational about that. For another thing, activists do clearly vital work across many areas, including, when it comes to mental and physical healthcare, advocating for reforms to certain biased or harmful research and clinical practices. I’ve stopped using the 80% desistance statistic, for example, in large part due to the important work of trans activists and writers and researchers, including some of the ones I cited in this post and believe to have taken the wrong stance on the broader question of desistance.

But the fact remains that there are some episodes in which some social-justice-minded activists (and, increasingly it seems, journalists) end up misunderstanding or actively distorting scientists’ good-faith efforts to get at the truth. People should be aware this can happen. In much the same way it would be an error to always instinctively trust scientists over activists when the two groups come to blows, it would be the same mistake, in the other direction, to interpret any such conflict as pitting “science bros” (or whatever) against infallible activists whose every critique of scientific findings is correct.

Plenty has been written about these sorts of unfortunate dustups between social-justice advocacy and sound scientific research, and how these fights sometimes skew the way certain types of findings are presented to the world, or which findings are presented to the world at all. I’d highly recommend anyone who has found this post at all interesting read Alice Dreger’s 2015 book “Galileo's Middle Finger: Heretics, Activists, and the Search for Justice in Science,” which covers both a major gender-dysphoria-related controversy and others, and also the Northwestern University psychology professor Alice Eagly’s excellent 2016 paper “When Passionate Advocates Meet Research on Diversity, Does the Honest Broker Stand a Chance?”

As for this particular beat, I think that improving things will involve not just journalists developing a much deeper familiarity with the extant research, but also interviewing a wider range of people. I’d argue that any journalist who wants to cover gender dysphoria has to interview both a bunch of practicing clinicians — not just the same three celebrity ones, ideally, but also ones who don’t get interviewed as much (they tend to tell very different stories) — and at least a few desisters and detransitioners. With regard to these latter two groups, I promise you: This is not “ex-gay” redux. These people have very important stories to tell that reveal holes in the healthcare system and in our present understanding of gender dysphoria. And I’m worried that unless the truly sorry state of gender-dysphoria science communication improves, ten years from now there will be a lot more detransitioners. That would be bad: There is no one involved in this conversation, at any level, who wants the number of future detransitioners to be higher rather than lower.

A final note that I always try to include when I write about this subject: I mostly critique mainstream gender-dysphoria journalism, which predominantly comes from liberal outlets, because I am much closer, socially and professionally and ideologically, to mainstream liberal outlets than I am to, say, Breitbart or The Federalist. Conservative outlets and organizations make serious errors of their own here — they repeat overestimated desistance rates (90% and up, I sometimes see!) or dumb and cruel applause lines about the folly of “enabling the delusions” of trans people (read: enabling them to live decent lives by transitioning and alleviating their gender dysphoria). Suffice it to say I’m strongly opposed to right-of-center journalism that misrepresents gender dysphoria research, too. I just don’t think I can exert quite as much influence there, because there’s a pretty big values gap.

Questions? Comments? [I usually put a joke-question here but I’m so wiped from writing this that I’ve got nothing]? I’m at singalminded@gmail.com, or on Twitter at @jessesingal.

Matt_410

Oct 28, 2022

This is super informative. It’s very hard to find non-partisan reviews of this literature. Thank you for this!

Emmett Flynn

Jan 21, 2024

I know I'm coming to this pretty late, but I'm in a discussion/debate with a trans friend of mine who's quite dismissive of the desistance literature for exactly the issues that you cite, so this has been immensely helpful. I've been reading into some of Zucker's literature from the 2000s to try and see if he's trying to address this, and it does seem that he spends a lot of time addressing it. Seeing as he was the head of the working group that, among other things, crafted the DSM-5 criteria for GD, I'd say that he and those who associate with him (also in the working group) have heard the criticism and are trying to make further research more robust by tightening up the criteria. I wouldn't be surprised if the desistance rate was still around 60%, but hopefully we'll see in future studies.

Also an issue with tightening up the criteria is denying more genuinely trans kids recommendations for their insurance to cover costs associated with transition pre-puberty, so there is a trade off between getting more accurate desistance measurements and delivering trans healthcare to those who need it which we may or may not hear about. But seeing as I'm a watchful waiting type person myself, perhaps this isn't that bad.
