Thank you for this. I read the entire Yale report a couple of months ago. As a trained researcher (Ph.D. Vanderbilt, Clinical Psychology; Professor of Psychology for 30 years), I was appalled.
A major theme I read was that we should accept low-quality research because there isn't high-quality research.
Be interested to see if Jesse has the same impression. Can't wait for part 2.
Of one thing we can be sure, and it’s that trans activists and their allies will fight hammer and tongs to block the production of high quality evidence and, if such evidence should ever emerge, they will fight tirelessly to discredit it and the researchers.
That’s what I thought too. It reeks of activism and not science, which is Jesse’s point.
Is it typical for researchers to not answer questions about their white paper? In peer-based scientific literature, I’ve always had to answer questions about my research.
I have never heard of such a thing. It is an enterprise which is supposed to be 100% transparent. Your approach displays the integrity that I have always experienced with research, starting with my days of being trained.
Hi... perhaps you can answer this for me. For research to be "high-quality", must you have randomized tests of patients? I have been trying to understand the difference between low-quality and high-quality research.
I ask because trans activist Erin Reed wrote the following:
"Sapir and other far-right news outlets claimed that the ASPS had “broken consensus with other major medical organizations on transgender care” by stating that evidence surrounding gender-affirming surgeries for transgender youth is “low quality.” This term, used in a technical context, refers to the lack of blinded clinical trials or other intensive forms of study that may not be feasible, rather than the colloquial meaning of "poor quality." "
https://www.erininthemorning.com/p/fact-check-asps-did-not-break-consensus
This is a red herring.
There are 4 levels in GRADE (high, moderate, low, very low), and a body of evidence can be rated down or up between them based upon how the studies were done (e.g. rated down for indirectness, or rated up for a strong dose-response effect). There is a handy chart here: https://www.jclinepi.com/article/S0895-4356(10)00332-X/fulltext table 3.
So first of all, you can be better than low quality with either moderate or high quality evidence.
You do not need a clinical trial to be blinded to get high or moderate quality evidence. Blinded is nice but of course you aren't going to do that for these interventions which have observable physical effects. So asking for blinded would not make sense. That is a straw man that keeps being brought up. They keep saying these studies, which are not being called for, can't be done. Ok, do the studies that can be done! Or, I know, show us the outcomes for all the things Chen et al 2023 said it would measure, how about that? How about for longer than 2 years?
It is true that randomized controlled helps, as people might change over time absent intervention. So picking people randomly for medical intervention vs., say, psychological support would be more informative than letting people pick which of the two they get and then following them, as their choices for treatment might depend on things which are relevant for how they do with a given intervention.
Relevant also for controlled studies--for gender dysphoria no one knows the "natural history" for adolescents, i.e. whether they will most likely outgrow it (and what that depends upon in the person). For childhood onset they most likely outgrow it without social or medical transition, or used to...
So you don't know what will happen if you intervene, or if you don't. The evidence is inadequate.
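To make the randomization point concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the `baseline` trait, the effect size of 1.0, and the rule by which people choose treatment are invented for illustration and are not taken from any study discussed here. It only shows the mechanism: when the thing that drives treatment choice also drives the outcome, the self-selected comparison overstates the effect, while the coin-flip comparison does not.

```python
import random


def simulate(n=20000, true_effect=1.0, seed=0):
    """Compare a randomized and a self-selected estimate of the same effect.

    All quantities are made up for illustration (assumption, not data).
    """
    rng = random.Random(seed)
    randomized = {True: [], False: []}
    self_selected = {True: [], False: []}

    for _ in range(n):
        # Hypothetical baseline trait that influences the outcome directly.
        baseline = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)

        # Randomized arm: a coin flip decides who is treated.
        treated_r = rng.random() < 0.5
        outcome_r = baseline + (true_effect if treated_r else 0.0) + noise
        randomized[treated_r].append(outcome_r)

        # Self-selected arm: a higher baseline makes choosing treatment more likely.
        treated_s = baseline + rng.gauss(0, 1) > 0
        outcome_s = baseline + (true_effect if treated_s else 0.0) + noise
        self_selected[treated_s].append(outcome_s)

    def mean(values):
        return sum(values) / len(values)

    def difference(groups):
        # Treated mean minus untreated mean.
        return mean(groups[True]) - mean(groups[False])

    return difference(randomized), difference(self_selected)


randomized_estimate, self_selected_estimate = simulate()
print(f"true effect:            1.00")
print(f"randomized estimate:    {randomized_estimate:.2f}")
print(f"self-selected estimate: {self_selected_estimate:.2f}")
```

In this toy setup the randomized estimate should land close to the true effect, and the self-selected one should come out noticeably larger, even though the treatment does exactly the same thing in both arms.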
Thank you...
"Relevant also for controlled studies--for gender dysphoria no one knows the "natural history" for adolescents, i.e. whether they will most likely outgrow it (and what that depends upon in the person). For childhood onset they most likely outgrow it without social or medical transition, or used to..."
I thought it was the reverse for who would outgrow it - that childhood onset was more likely to persist and adolescent onset was more likely to be transient.
Maybe it's - "childhood onset that persists to adolescence" vs "adolescent onset"
Actually, A.D., you are correct on every score in what you are saying. I was trained in the early 70s that there is a group of individuals, mostly boys who knew from day one that they were girls. Nobody reasonable doubts that. Most did not outgrow it. And they were psychologically healthy.
But that's not the case any longer. Most cases are adolescent onset, girls, and kids with multiple psychiatric problems.
So, your last sentence fits with the research and with my training 50 years ago.
Seems that about 80% of even those children "who knew from day one" that they were the opposite sex actually *do* outgrow it. http://www.sexologytoday.org/2016/01/do-trans-kids-stay-trans-when-they-grow_99.html
Sissy boys and tomboy girls grow up again and again to be gay men and lesbians. I know, I am one of them.
This is why to me "gender affirming care" is the gay/lesbian conversion therapy from hell.
BTW, I explain all this to other gay men and they get angry and accuse me of transphobia, etc. Just totally clueless and don't want to know either.
Hey, thanks for the sources. I only had time to read one of them, and am heading out for the day.
Which one of the sources did you get your data about 80% from? I want to be sure to read that one first. I'll get to all of them later in the day or tomorrow. Thanks again.
There is also a peer reviewed article (Ristori & Steensma, 2016) but also a recent paper summarizes a lot of them and is not paywalled: Singh, Bradley, Zucker, 2021.
It is one of the studies in the earlier summaries, it's just it was originally a thesis and now there is a peer reviewed paper with a review of earlier literature that is more current.
I tried reading it, but got lost in the terminology of biphilic/androphilic and all of the other phrases like this. Couldn't make heads or tails of the article, but thanks for sending it my way. Ordinarily I enjoy reading research, even research that contradicts my beliefs, but this one was over the top for me.
ah darn, sorry! The philia stuff is whether they are gay or not?
Here:
"In the Wallien and Cohen-Kettenis (52) study, the DSM-III-R criteria were used to diagnose GID. Of the 12 persisters, all met the criteria for GID at the time of the baseline assessment; in contrast, only 68% of the 47 desisters met the criteria for GID; the remainder were deemed subthreshold for the diagnosis. Thus, in their study, the threshold-subthreshold distinction appears to have been an important one in predicting outcome; nonetheless, it should be noted that 68% of the desisters had been threshold for the diagnosis in childhood—perhaps a strong rebuttal to the No True Scotsman argument. In Steensma et al. (51), the DSM-IV-TR criteria were used. Of the 23 persisters, 21 (91.3%) met the criteria for GID; in contrast, only 22 (39.3%) of the 56 desisters were threshold for the diagnosis, suggesting an even more substantial difference in the threshold-subthreshold distinction than was found in Wallien and Cohen-Kettenis. Although the latter percentage was lower than what was found in Wallien and Cohen-Kettenis, that almost 40% of the desisters met the criteria for GID in childhood still argues in favor that the children were desisting from something.
From Wallien and Cohen-Kettenis (52) and Steensma et al. (51), one predictor of outcome, therefore, was the distinction between being threshold or subthreshold for the GID diagnosis in childhood. Dimensional measures of gender-variant behavior have also proven useful. In both Wallien and Cohen-Kettenis and Steensma et al., dimensional measures of sex-typed behavior in childhood also significantly discriminated between the persisters and desisters, with the former group having, on average, more severe gender-variant behavior at the time of the childhood assessment. Steensma et al. found two other predictors of persistence: boys who were assessed at an older age and boys who had made either a partial or complete gender “social transition” [see (68–70)]. Of the 12 boys who had partially or completely transitioned prior to puberty, 10 (83.3%) were classified as persisters. In contrast, of the 67 boys who had not socially transitioned, only 13 (19.4%) were classified as persisters."
and from the abstract for this study in particular, of 139: " Of the 88 participants who met the full diagnostic criteria for GID in childhood, 12 (13.6%) were classified as persisters and the remaining 76 (86.4%) were not. Of the 51 participants who were subthreshold for the GID diagnosis in childhood, 5 (9.8%) were classified as persisters and the remaining 46 (90.2%) were not. "
You went to all of this work to give an explanation? That really starts my day off right.
It appears I have been wrong. That... uh... people have learned more since I was trained (55 years ago) or retired (almost 20 years ago). The thing is... I have always liked learning.
So I thank you. It's nice being on substack because you get to interact with people who want to talk instead of fight.
"Assessed at an older age" - does that mean they only first presented at an older age or that, regardless of when they first presented symptoms of GID they were only assessed in this study at an older age?
It's heartening to see further criteria by which the group can be subdivided accurately. Providing transition support to the sub-group that would persist 80% of the time is very different than providing it to the group that would persist only 20% of the time.
80% was my calculation, based on adding all the individuals' outcomes in all the studies. Of course, the groups and methodologies of the studies vary, so they're not quite commensurable - but it's a ballpark figure.
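For anyone who wants to see what "adding all the individuals' outcomes" looks like in practice, here is a rough sketch. It is not the calculation described in the comment above, and it is an assumption that pooling these particular studies is appropriate: it uses only the three sets of persister/desister counts quoted from the Singh, Bradley and Zucker review in this thread. Other counts quoted here for the 2008 Wallien and Cohen-Kettenis paper (21 of 54) differ and would shift the result.

```python
# (persisters, total followed), using only counts quoted in this thread.
studies = {
    "Wallien & Cohen-Kettenis": (12, 12 + 47),
    "Steensma et al.": (23, 23 + 56),
    "Singh, Bradley & Zucker": (12 + 5, 88 + 51),
}


def desistance_rate(persisters, total):
    """Share of followed participants who were not classified as persisters."""
    return (total - persisters) / total


def pooled_desistance(study_counts):
    """Pool by summing individuals across studies (ignores study differences)."""
    persisters = sum(p for p, _ in study_counts.values())
    total = sum(n for _, n in study_counts.values())
    return desistance_rate(persisters, total)


for name, (persisters, total) in studies.items():
    rate = desistance_rate(persisters, total)
    print(f"{name}: {total - persisters}/{total} = {rate:.1%}")

print(f"Pooled: {pooled_desistance(studies):.1%}")
```

With these inputs the per-study rates range from roughly 71% to 88% and the pooled figure comes out near 81%, which is why it is best read as a ballpark rather than a precise estimate.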
Hi Barney. Could not get ahold of the actual citations (behind paywalls), but I did read all of the shortened versions.
We may have a disagreement about what these show. Let me explain:
1. They seem non-representative of studies in the area. Some are quite old, in fact.
2. Even with that, they seem to show that people with more evidence of GID in childhood have much more of a likelihood to have it persist into adulthood.
Don't know if it is possible, but if you could give me your analysis that came up with an estimate of 80% I am totally willing to look at that. I desperately delve into research, and am always up for learning.
thanks!
Jesse has written about this in detail in the past… even before he had a Substack:
https://medium.com/@jesse.singal/everyone-myself-included-has-been-misreading-the-single-biggest-study-on-childhood-gender-8b6b3d82dcf3
And
https://jessesingal.substack.com/p/how-science-vs-accidentally-invented
Interesting.
Some of those studies I might be inclined to be skeptical of as either being too recent (if there's been a social push to question, might that push numbers higher?) or too far back (if there was a lot of stigma with being trans, might that push things too far the other way?).
(And another one is about "effeminate behavior" not "dysphoria".)
At least based on the dates there, this one:
Wallien, M. S. C., & Cohen-Kettenis, P. T. (2008). Psychosexual outcome of gender-dysphoric children. Journal of the American Academy of Child and Adolescent Psychiatry, 47, 1413–1423.
Lists:
trans-21/54
cis-33/54
(It does not further break down gay/lesbian)
That's got a higher percentage, but still < 50%
I have not read that paper; it stood out to me as a good year (late enough for gay rights, early enough before the recent rise), as the highest percentage, and as definitely (based on the title) about dysphoria.
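As a quick check on the 21/54 figure, here is a small sketch that turns it into a percentage and puts a rough 95% interval around it. The Wilson score interval is my choice for illustration, not something the paper or this thread uses, and it treats the 54 as a simple random sample, which is an assumption.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 for roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


persisters, followed = 21, 54
low, high = wilson_interval(persisters, followed)
print(f"persisted: {persisters / followed:.1%}")
print(f"rough 95% interval: {low:.1%} to {high:.1%}")
```

The point estimate is about 39%, so below 50% as stated, but with only 54 people the interval is wide, running from about 27% to slightly above 50%.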
If I am understanding your question correctly, let me try to give my views on this.
The gold standard is double-blind, completely randomized studies. Those are probably impossible in this area. But that does not mean that studies that are, essentially, correlational (which many, many in the area are) simply get elevated to being high quality, with conclusive statements made about their correlational findings. No. That's inappropriate in social science.
We also don't know how many followup, correlational studies from these clinics are NOT reported because they don't support the views of the clinic.
I have also seen studies that seem, very clearly, to rely on "p-hacking" to get their results. In other words, they vary the internal structure of their study to get significant results. This kind of thing results in totally erroneous statistical conclusions. Those are low quality.
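A back-of-the-envelope sketch of why this matters. It assumes the simplest possible case: each analysis is an independent test at the usual 0.05 threshold and there is no real effect at all. Real p-hacking is messier (the analyses are usually correlated), so treat the numbers as illustrative only.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of `num_tests` independent tests
    comes out 'significant' when no real effect exists."""
    return 1 - (1 - alpha) ** num_tests


for num_tests in (1, 5, 10, 20):
    chance = chance_of_false_positive(num_tests)
    print(f"{num_tests:>2} analyses tried -> {chance:.0%} chance of a spurious hit")
```

Under these assumptions, trying 20 different analyses gives about a 64% chance of at least one "significant" result from pure noise, which is why results found this way can't be taken at face value.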
There are other markers of low quality in this area. Often, in the studies I have read, there is no indication of whether all of the youth who had, for example, transitioned from the particular clinic were followed. How representative are the samples studied of the people who were treated at a particular clinic? When not everyone is followed, how do you "mark" those who drop out? Is there a bias or lack of bias in the people who are followed?
Most of the studies seem to be from gender clinics (I'm sure there are exceptions). Cass, for example, had no investment in the findings of her review.
Thank you for your response.
I remember that in the "Dutch Protocol", the seminal study of gender care, the beginning N was 70 and the ending N was 55.
So it lost a bit over 20% of participants (15 of 70), but I don't think the study can say why. (Actually, 1 of the 55 persons died from an infection from vaginoplasty. I believe they had been on puberty blockers and had a micro-penis, so there was insufficient flesh to work with and rectal tissue was used, I believe.)
In a retrospective study that included the cohort of the original Dutch Protocol kids, about 20% discontinued identifying as the opposite gender too.
https://doi.org/10.1093/jsxmed/qdad062.088
Thanks for this study. I just read it, with interest.
You might want to read Abbruzzese et al, The Myth of reliable research in pediatric gender medicine... (2023).
There is also a paper by Biggs on the history of the Dutch protocol which talks about the younger kids a lot, too.
I think the 1 who died was one of the 15 lost between going from 70 to 55. They checked the 55 1 year after surgery and that poor kid passed away soon after surgery, so...."lost to follow-up".
I had read about the 70 down to 55 issue, but didn't know about the rest. Thanks!