A New Subclade of Y Haplogroup J2b

 

 

T. Whit Athey and Bonnie E. Schrack

 

 

Abstract

 

Evidence from case studies is presented that allows the definition of a new subclade of Y Haplogroup J2b-M102.  The defining binary polymorphism of the new subclade is a deletion event in the Y-STR marker, DYS455.  The present study shows that the new subclade is located in the J2b phylogeny downstream from the previously defined subclade J2b-M241. The members of the subclade were asked about Ashkenazi and Cohanim backgrounds on their patrilineal lines and of the four who responded, all four are Ashkenazi Jews and Cohanim.  This suggests that future Y-chromosome studies involving Ashkenazi and Cohanim should type DYS455 to investigate further this possible link.

 

 

 

Address for correspondence:  Whit Athey, wathey@hprg.com

 

Received:  January 9, 2008; accepted:  February 27, 2008

 

 

 

Introduction

 

About one-fifth of Ashkenazi Jews are members of Y-Haplogroup J2 (Behar, 2004).  Previous studies of Ashkenazim did not employ sufficient resolution to understand exactly where the Jewish groups were located in the J2 phylogeny, or if there were any subclades of J2 that were almost exclusively Jewish.  The present study focuses on one small part of Haplogroup J2 and demonstrates that a new subclade of Haplogroup J2b-M12 may be defined.  All members of the subclade who have been identified so far are Ashkenazi and Cohanim, though the sample is small.

 

Figure 1 provides a simplified overview of Haplogroup J2 (ISOGG, 2007) and shows the location and phylogenetic structure of Haplogroup J2b.

 

 

 

 

 

 

Figure 1.  Simplified phylogenetic chart for Haplogroup J2 according to ISOGG (2007).  Text in red font indicates defining binary polymorphisms.

 

 

The binary polymorphism (BP) M102 is probably redundant with M12, but previously published Haplogroup J phylogenetic trees, including that of ISOGG (2007), still show both BPs, as reflected in Figure 1.  It is likely that there will be other changes to the 2008 version of the ISOGG tree, necessitated by the publication of new results during the past year.

 

Within Haplogroup J2b1-M102, there are presently two subclades, J2b1a defined by M241 and J2b1b defined by M205 (ISOGG, 2007).  Note that some phylogenetic trees have the position of M241 and M205 interchanged, so in this article, the defining BP will be used to make it clear which subgroup is intended.  On the basis of limited data, J2b1a-M241 appears to be the larger of the two subgroups (Sengupta 2006).  Several customers of Family Tree DNA (FTDNA, Houston, TX) who are M102+ (in J2b1) have the unusual value of 8 on the Y-STR marker, DYS455.

 

The Y-STR marker DYS455 consists of a poly-(AAAT) sequence, with the great majority of men having a repeat value of 11.  A repeat value of 8 on DYS455 has previously been observed only in members of Haplogroup I1a, where the value resulted from a three-repeat deletion event.  That event must have occurred close to the founding of Haplogroup I1a, because thus far no members of I1a have failed to exhibit the deletion.  The deletion is stable within Haplogroup I1a, with only about 2% of members having mutated to values of 7 or 9 in the 10,000+ year history of the haplogroup.

 

Those J2b members with DYS455=8 have apparently also inherited this value from a common ancestor who, given the similarity in their Y-STR profiles, probably lived less than one thousand years ago.

 

Because DYS455=8 occurs only in Haplogroup I1a and in the J2b subclade that we are presently considering, it is quite probable that the three-repeat deletion event in DYS455 has occurred just twice in human history.  Alternatively, if it has occurred more often, only two of the founders have living descendants.  In the present study we show that the deletion event in a J2b founder has allowed us to define a new subclade within Haplogroup J2b.  Testing of binary polymorphisms (BPs) in several subjects demonstrates that the new subclade is located within Haplogroup J2b-M241, though the exact relationship to the other subgroups of J2b-M241 is still uncertain, as discussed in more detail below.  We have named the new subclade J2b1a4, though revisions to the J phylogenetic tree will likely cause the name to change.

 

Methods

 

Four subjects were recruited who had previously been tested and found to be M205+, along with four subjects who had previously been tested as M241+.  Two of these eight subjects had been tested on both M241 and M205.  Four subjects who were M102+ and who had DYS455=8 were also recruited.  Two of the latter four subjects had also previously been tested on M205 at FTDNA and had been found to be negative.  These two subjects were then tested by Ethnoancestry for M241.  One of these has also been tested on two SNPs, M99 and M280, that define subgroups of J2b-M241.  Four other subjects with DYS455=8, but who have no BP results, were also included in the study for comparison on Y-STR values.  These twelve participants in the present study were designated as Subjects 101-112.

 

The public databases of the Sorenson Molecular Genealogy Foundation (SMGF), Y-Search, and Y-Base were searched for occurrences of DYS455=8.  Those four that also appeared to be members of Haplogroup J2b, according to the haplogroup prediction program of Athey (2005, 2006), were included in the study as Subjects 113-116, in order to further characterize the Y-STR values of the new subclade.

 

For the present study, testing of M241 was carried out on selected subjects by Ethnoancestry (Cyprus, CA).  Testing of the BPs M99 and M280 was carried out previously at FTDNA.  At Ethnoancestry, Y chromosome BPs are amplified by PCR with standard primers giving products from 200 to 500 bp in length.  PCR products are then sequenced using dye terminator chemistry with electrophoresis on a capillary ABI sequencer.  Alleles are called in the software package, Sequencher, by alignment with chromosomes of known allelic state (positive and negative controls).  The procedures followed at FTDNA have not been published, but they are probably similar.

 

Results

 

The results for each subject on several BPs are shown in Table 1.  These results can be used to locate the DYS455 deletion event within the phylogeny of Haplogroup J2b.  Because the detailed logic model for placing a new BP within an existing two-branch phylogeny is not usually included in research articles, it will be discussed in some detail here.  Following the discussion of the placement of the new subclade, this article will also discuss in some detail the issue of when it is appropriate to use a deletion in a Y-STR marker for phylogenetic purposes, and then apply the resulting principles to the present case.

 

 

 

 

In the following discussion and in Table 1, the BP representing the deletion event in DYS455 will be designated as “455del.”  A person who has a value of 8 on DYS455 (and rarely 7 or 9) will be termed 455del+, while a person who has the value of 11 (or rarely 10 or 12) will be termed 455del-.
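
The calling rule just described can be sketched as a small function.  This is only an illustration: the function name `call_455del` is hypothetical, and returning "uncalled" for repeat values outside the two stated ranges is our assumption, since the text does not say how such values would be handled.

```python
def call_455del(dys455_repeats: int) -> str:
    """Classify a DYS455 repeat count as 455del+ or 455del-.

    Ranges follow the text: 8 (rarely 7 or 9) is 455del+,
    11 (rarely 10 or 12) is 455del-.
    """
    if dys455_repeats in (7, 8, 9):
        return "455del+"
    if dys455_repeats in (10, 11, 12):
        return "455del-"
    # Assumption (not from the source): any other value is left uncalled.
    return "uncalled"


for repeats in (8, 11, 9, 12):
    print(repeats, call_455del(repeats))
```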

 

Placement of the New Subclade Within J2b1

 

All of the subjects in the present study who have DYS455=8 and test results for M102 were found to be M102+.  Therefore, the placement of the new clade focused on the structure within J2b1-M102.  Figure 1 shows that the structure of J2b1-M102 consists of two subgroups defined by M241 and M205.  In the following, we assume that this basic structure is correct—that neither M205 nor M241 is upstream or downstream from the other, and that they define parallel branches as shown.  The new subclade defined by DYS455=8 cannot be parallel to the other two branches since at least two of its members are also M241+.

 

Therefore, we now develop a logic model for placing a new subclade within this or a similar structure.  The approach is to lay out all of the possible placements for the BP 455del.  In most cases, only a single counterexample is necessary to disprove a potential placement, though a second confirming case is helpful in ruling out lab errors.  If all possible placements except for one can be disproved, then the remaining one is the correct one.  Looking only for positive evidence for the most likely placement would require a large number of subjects to provide reasonable assurance of the placement being correct.  The present approach is efficient in that only a small number of well-chosen subjects will suffice to place the new subclade.

 

a.  In the following discussion, Figure 2 outlines all of the possible placements of the new subclade of J2b1 defined by 455del.  We will consider each possibility in turn, and show what would be necessary to negate or disprove each possibility.  The possibility remaining after this process of elimination, the one that cannot be rejected, is the correct placement.

 

Figure 2.  The seven possible phylogenetic locations for the deletion event in DYS455 with respect to M102, M241, and M205.

 

 

 

b.  The 455del is upstream from both M241 and M205 as shown in Figure 2b.

 

In this case, all M241+ and M205+ subjects will also have 455del.  However, if any one M241+ or M205+ subject does not have the 455del, then this possibility is rejected.  In this study subjects 101-104 are M241+ and 105-108 are M205+ and all have 455del- (i.e., they do not have the deletion).  Therefore, this possibility is rejected.

 

c.  455del is redundant with M241 as shown in Figure 2c.

 

To disprove this possibility, we need only to find at least one 455del+ subject who has M241-, or at least one M241+ who has 455del-.  In the present study, subjects 101-104 are all M241+, but none has the deletion.  Therefore, this possibility is rejected.

 

d.  455del is redundant with M205 as shown in Figure 2d.

 

To disprove this possibility, we need only to find at least one 455del+ subject who has M205-, or at least one M205+ subject who has 455del-.  In the present study, subjects 105-108 are all M205+, but none has the deletion.  Therefore, this possibility is rejected.

 

e.  455del is upstream on the branch leading to M241 as shown in Figure 2e.

 

To disprove this possibility, we need only find at least one M241+ subject who is negative for 455del.  This disproof is provided by subjects 101-104, all of whom are M241+ and 455del-.

 

f.  455del is upstream on the branch leading to M205 as shown in Figure 2f.

 

To disprove this possibility, we need only show that there exists at least one M205+ subject who is negative for 455del.  This disproof is provided by subjects 105-108, all of whom are M205+ and 455del-.

 

g.  455del is downstream from M241 as shown in Figure 2g.

 

To disprove this possibility, we would need to find a subject who exhibits the deletion in DYS455, but is M241-.  We found no such results, so this remains an open possibility.

 

h.  455del is downstream from M205 as shown in Figure 2h.

 

To disprove this possibility, we would need to find a subject who exhibits the deletion in DYS455, but is M205-.  Subjects 109 and 112 have 455del+, but are M205-, so they provide this disproof.

 

Summarizing, we have shown that every possibility except for (g) has been disproved.  Since one possibility must be true, it must be (g).  Indeed, subjects 109 and 112 have results that are consistent with 455del being downstream from M241.
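
The process of elimination in (b) through (h) can be sketched in a few lines of code.  This is an illustrative sketch, not the procedure actually used in the study: the names `Subject`, `CONTRADICTS`, `surviving_placements`, and `counterexamples` are hypothetical, the subject statuses are only those stated in the text (Table 1 is not reproduced), and any marker whose status is not stated is recorded as `None` and never used as evidence.

```python
from typing import NamedTuple, Optional


class Subject(NamedTuple):
    sid: int
    m241: Optional[bool]   # None = status not stated in the text
    m205: Optional[bool]   # None = status not stated in the text
    has_deletion: bool     # True = 455del+, False = 455del-


# Statuses as described in the text for subjects 101-108, 109, and 112.
SUBJECTS = (
    [Subject(sid, True, None, False) for sid in range(101, 105)]
    + [Subject(sid, None, True, False) for sid in range(105, 109)]
    + [Subject(109, True, False, True), Subject(112, True, False, True)]
)

# For each candidate placement (b)-(h), a rule that returns True when one
# subject's results contradict that placement.
CONTRADICTS = {
    "b": lambda s: (s.m241 is True or s.m205 is True) and not s.has_deletion,
    "c": lambda s: s.m241 is not None and s.m241 != s.has_deletion,
    "d": lambda s: s.m205 is not None and s.m205 != s.has_deletion,
    "e": lambda s: s.m241 is True and not s.has_deletion,
    "f": lambda s: s.m205 is True and not s.has_deletion,
    "g": lambda s: s.has_deletion and s.m241 is False,
    "h": lambda s: s.has_deletion and s.m205 is False,
}


def surviving_placements(subjects):
    """Return the placements that no subject contradicts."""
    return [
        label
        for label, rule in CONTRADICTS.items()
        if not any(rule(s) for s in subjects)
    ]


def counterexamples(label, subjects):
    """Return the IDs of subjects whose results contradict a placement."""
    return [s.sid for s in subjects if CONTRADICTS[label](s)]


print(surviving_placements(SUBJECTS))
print(counterexamples("h", SUBJECTS))
```

With the statuses given in the text, only placement (g) survives, and subjects 109 and 112 are the counterexamples to (h).  A single contradicting subject is enough to remove a placement, which is why so few subjects are needed.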

 

Placement of the New Subclade Within J2b1a-M241

 

Within Haplogroup J2b1a-M241, there are already three previously known subgroups (ISOGG 2007), so the subclade defined by 455del could be (1) a fourth subgroup in parallel to the existing three subgroups, (2) downstream from one of the three, (3) upstream from one or more of the three, or (4) redundant with one of the three.  In principle, we could use the same kind of logic model and apply it to the placement of 455del within J2b1a-M241.  However, at the present time, only two BPs, M99 and M280, out of the three defining these three subgroups of J2b1a-M241, have commercially available tests, so the exact placement of the new subclade cannot be completely resolved at this time.  Subjects 109 and 112, who have DYS455=8, were found to be negative for M99 and M280, so we can at least be sure that 455del is not downstream from either of these two SNPs.  In other words, the subclade defined by the deletion is not a subgroup of J2b1a1-M99 or J2b1a2-M280.  We were unable to identify any subject who was M99+ or M280+, so we cannot say with certainty that one of these two BPs is not downstream from the deletion event.  Tentatively, the new subclade can be placed as a parallel branch to the three other subgroups of J2b1a-M241 as shown in Figure 3.

 

 

 

Figure 3.  Haplogroup J2b1 with the new subclade shown (simplified for clarity).

 

 

 

The Use of DYS455=8 for Phylogenetic Purposes

 

If the deletion in DYS455 occurred in the region flanking the (AAAT)n repeat structure, then it would be considered just another binary polymorphism like many other indels included on the Y phylogenetic tree.  However, the PCR product for DYS455 has not been sequenced to show the exact nature of the deletion event, so we must assume the worst case: that the deletion occurred entirely within the poly-(AAAT) structure, so that the remaining part of the repeat structure would continue to behave as a normal, albeit truncated, Y-STR marker.  Therefore, we must examine the appropriateness of using this particular deletion in a Y-STR marker for phylogenetic purposes.

 

To be appropriate for use for phylogenetic purposes, the Y-STR marker should have the following characteristics:

 

Criterion 1.  The marker should be quite stable at its new value, with essentially no chance of reverting to its previous value of 11.

 

Criterion 2.  Further mutations in the marker should not produce an allele frequency distribution that overlaps the former distribution.

 

Criterion 3.  The deletion event should be “almost unique,” perhaps occurring no more than a few other times in human history.  There are several BPs on the Y phylogenetic tree that have occurred two or more times in different parts of the tree.  Where BPs are not unique, they must be easily interpretable within the context of the phylogeny.  Absolute uniqueness has never been required of a marker on the Y tree.

 

A final characteristic should be mentioned, though it applies to other types of markers as well—the marker should be useful for phylogenetic purposes.  For example, there is no reason to use the deletion in DYS455 within Haplogroup I because there are already a number of redundant SNPs available to define Haplogroup I1a.  The deletion in DYS455 is useful within Haplogroup J2b because there is apparently not currently a SNP available to define the subclade.

 

Starting with the last criterion, the value DYS455=8 exists also in Haplogroup I1a, but it has not been reported in any other haplogroup.  Therefore, the “almost unique” criterion is met.  Since the modal value within Haplogroup I is also 11, it would appear that a similar three-repeat deletion has occurred, though the marker has not been sequenced in I1a either.  Therefore, we can examine the stability criterion within Haplogroup I1a where a very large amount of data is available.

 

A group of 168 haplotypes that had been tested and found positive for one of the defining BPs for Haplogroup I1a (usually M253) was extracted from public surname project web pages at the FTDNA web site.  Duplicates from the same surname cluster were not included.  Of these haplotypes, 166 had a value of 8 on DYS455, while two had a value of 9.  Another group of 374 haplotypes was extracted from Y-Search where Haplogroup I1a was indicated as the haplogroup, again not counting duplicate surnames, and the DYS455 values for these were also examined.  Only nine of the 374 haplotypes had values different from 8: two haplotypes had values of 7 and seven had values of 9.  Therefore, since we see only a very sharp distribution centered at DYS455=8, in spite of Haplogroup I1a likely being more than 10,000 years old (Rootsi 2004), the marker at a repeat value of eight appears to be remarkably stable.  In Haplogroup I1b-S31, DYS455=11 is the modal value, and out of over 300 haplotypes extracted from Y-Search, only one had a value of 10, and one had a value of 12.  Therefore, the allele frequency distribution on DYS455 for the remainder of Haplogroup I does not overlap the data from I1a.  However, if we examined a sample of millions of people, there could be a small degree of overlap.
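
The counts above can be tallied directly to check the stability and non-overlap criteria.  The sketch below uses only the numbers quoted in this section; the variable and function names are hypothetical, and because the exact I1b-S31 sample size is not given ("over 300"), only the set of repeat values observed there is recorded.

```python
from collections import Counter

# DYS455 repeat counts quoted in the text for Haplogroup I1a.
FTDNA_I1A = Counter({8: 166, 9: 2})            # 168 haplotypes
YSEARCH_I1A = Counter({7: 2, 8: 365, 9: 7})    # 374 haplotypes; 365 = 374 - 9

# Repeat values observed in I1b-S31; exact counts are not given in the text.
I1B_OBSERVED_VALUES = {10, 11, 12}


def off_modal_fraction(counts, modal=8):
    """Fraction of haplotypes whose DYS455 value differs from the modal value."""
    total = sum(counts.values())
    return (total - counts[modal]) / total


combined = FTDNA_I1A + YSEARCH_I1A
for name, counts in [
    ("FTDNA", FTDNA_I1A),
    ("Y-Search", YSEARCH_I1A),
    ("combined", combined),
]:
    print(f"{name}: {off_modal_fraction(counts):.1%} not at DYS455=8")

# Criterion 2: the I1a values should not overlap those seen in the rest of I.
overlap = set(combined) & I1B_OBSERVED_VALUES
print("overlapping repeat values:", sorted(overlap))
```

Pooling the two samples gives 11 of 542 haplotypes away from the value of 8, which is the basis for the figure of about 2%, and no repeat value is shared between the I1a samples and the values reported for I1b-S31.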

 

Haplogroup J is older than I1a, so we would expect to see a broader distribution of repeat values on DYS455.  Approximately 5% of Haplogroup J2a has a value of 10 on DYS455, but no values of 9 have yet been reported.  Among a group of 100 people who were tested or predicted by FTDNA to be in J2b, none had the value 10 and only one had a value of 12.

 

With only a small number of members of Haplogroup J2b-DYS455=8 being thus far identified, it is not surprising that the only repeat value on DYS455 that has been observed is eight.  It can be expected that the allele frequency distribution on DYS455 will be similarly stable and no less sharply distributed about the value of 8 as was found in Haplogroup I1a.

 

Therefore, all three criteria are satisfied and the deletion event in DYS455 may be used for phylogenetic purposes within Haplogroup J2b.

 

It should be noted that this is not the first use of a Y-STR marker for phylogenetic purposes (Di Giacomo 2004; Sengupta 2006; ISOGG 2007; King 2008).

 

Origins of the Subclade

 

There are currently 10 people in the database of FTDNA who have DYS455=8 and also appear to be members of Haplogroup J2b, either through testing of M12 or prediction based upon Y-STR haplotype.  Three others appear to belong to the same Y-STR cluster, but they have not tested DYS455.

 

All of those members of J2b-DYS455=8 who have entries on Y-Search and who identify their patrilineal family background, including Subjects 109, 110, and 112, either indicate Ashkenazi Jewish roots or list earliest known ancestors whose names are associated with Ashkenazim.  Four of these have responded to private inquiries, and all four indicated that their lines have traditions of being Cohanim.  It is not yet clear whether there are any other Cohanim within Haplogroup J2b.  However, if our suggestion of a relatively recent founding event for J2b1a-DYS455=8 is correct, there may be other Cohanim lines found within the J2b1-M241 background in which the deletion event in DYS455 occurred.

 

It is also possible that non-Jewish members of the new subclade may eventually be identified.  Additionally, with a larger sample size, we might locate Ashkenazim within the new subclade without a tradition of being Cohanim.  We cannot rule out these possibilities because of the small number of members of the subclade identified so far, and the possibility of selection bias.  In future studies of Ashkenazi Jews, of Cohanim, or of Haplogroup J2b, it would be very helpful if DYS455 were tested, at least in those subjects who are members of Haplogroup J2, so that these questions could be investigated with sufficient numbers to make firmer conclusions.

 

Y-STR Values of J2b-DYS455=8

 

Table 2 shows the Y-STR values for the 16 subjects.  The first eight subjects are in J2b-M102 and the last eight are also members of the new subclade of J2b-M241.  Note that there is limited diversity among the members of the new subclade, implying that it is fairly young.

 

 

 

None of the Subjects 109-112 knows of a genealogical connection to any of the others, and all four have different surnames.  Using Subject 112 as a point of comparison, the time-to-the-most-recent-common-ancestor (TMRCA) calculator used by FTDNA, called “FTDNATiP,” gives a 50% probability of a TMRCA at 3 and 6 generations, respectively, for the other two subjects who have 37 markers reported (Subjects 109 and 110).  Using only the 25 markers in common to Subjects 112 and 111, the calculator provides a corresponding 50% TMRCA at 12 generations.  Clearly, this is an inadequate sample for precise determination of when the most recent common ancestor lived, but it is very likely that this ancestor lived in the last millennium.
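
To relate these generation counts to calendar time, they can be converted to years.  The sketch below is a rough illustration only: the generation lengths of 25 and 30 years are our assumption and are not taken from FTDNATiP or from the data in this study, and the function name `generations_to_years` is hypothetical.

```python
# 50% TMRCA points quoted in the text, in generations, relative to Subject 112.
TMRCA_GENERATIONS = {109: 3, 110: 6, 111: 12}

# Assumed generation lengths in years (an assumption, not from the source).
YEARS_PER_GENERATION = (25, 30)


def generations_to_years(generations):
    """Convert a generation count to a (low, high) range in years."""
    low, high = YEARS_PER_GENERATION
    return generations * low, generations * high


for subject, generations in TMRCA_GENERATIONS.items():
    low, high = generations_to_years(generations)
    print(f"Subject {subject}: {generations} generations ~ {low}-{high} years")
```

Under these assumed generation lengths, even the largest value of 12 generations corresponds to 300-360 years.  These are 50% points rather than upper bounds, so they indicate only the approximate time scale.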

 

Discussion

 

The present study has identified a new subclade of Haplogroup J2b and the subclade has been shown further to be a subgroup of Haplogroup J2b1a-M241.  Precise placement of the new subclade with respect to the previously known subgroups of J2b1a-M241 must await the development of tests for their defining BPs and the identification of subjects who are positive for each of them.

 

Only about a dozen members of the new subclade have been identified so far, and only about half of these have responded to inquiries, but all who responded have indicated Ashkenazi and Cohanim backgrounds for their paternal lines.  The geographic origin of the paternal lines, where known, was most often indicated as Eastern Europe.  The addition of testing of DYS455 in future Jewish DNA studies would be useful, at least for those subjects who are in Haplogroup J2 (i.e., positive for M172, which is almost always tested).

 

As more information is accumulated, the new subclade may be found to include non-Jewish members.  However, with the apparently young age of the subclade (only several hundred years), it is also possible that the deletion event in DYS455 occurred within an Ashkenazi community in Eastern Europe.  If this scenario is correct, then it implies that the deletion occurred in an Ashkenazi Jew who was M241+ and whose father had a value of 11 on DYS455.  There could be other Ashkenazi Jewish lineages that have survived to the present from this ancestral M241+/455del- background.  All of those in the present study who were M241+ and 455del- reported that they were of northwest European ancestry and had no Jewish roots, so it appears likely that M241+ became established prior to the first M241+ persons becoming identified as Jewish.

 

Semino (2004) found that 1.2% of the Ashkenazi subjects in that study were M102+, which represents about 5% of Ashkenazi J2’s.  The new subclade is too small to include all of these J2b’s.  An interesting question for future studies is, where are the other J2b Ashkenazim in the J2b phylogeny?  Some are probably in J2b-M241*, but we do not know at present if there are significant numbers in J2b-M205.

 

Web Resources

 

http://www.ysearch.org                                        genetic genealogy database

http://www.ybase.org                                           genetic genealogy database

http://www.smgf.org                                            genetic genealogy database

http://home.comcast.net/~hapest5/index.html      haplogroup predictor

 

References

 

Athey TW (2005)  Haplogroup prediction using an allele-frequency approach.  J Genet Geneal 1:1-7.

 

Athey TW (2006)  Haplogroup prediction from Y-STR values using a Bayesian-allele-frequency approach.  J Genet Geneal 2:34-39.

 

Behar DM, Thomas MG, Skorecki K, Hammer MF, Bulygina E, Rosengarten D, Jones AL, Held K, Moses V, Goldstein D, Bradman N, Weale ME (2003) Multiple origins of Ashkenazi Levites: Y chromosome evidence for both Near Eastern and European ancestries.  Am J Hum Genet 73:768–779.

 

Di Giacomo F, Luca F, Popa LO, Akar N, Anagnou N, Banyko J, Brdicka R, Barbujani G, Papola F, Ciavarella G, Cucci F, Di Stasi L, Gavrila L, Kerimova MG, Kovatchev D, Kozlov AI, Loutradis A, Mandarino V, Mammi C, Michalodimitrakis EN, Paoli G, Pappa KI, Pedicini G, Terrenato I, Tofanelli S, Malaspina P, Novelletto A (2004)  Y chromosomal haplogroup J as a signature of the post-Neolithic colonization of Europe.  Hum Genet 115:357-71.

 

ISOGG--International Society of Genetic Genealogy (2007) Y Haplogroup Tree.  Web Site URL: http://www.isogg.org/tree/index.html.

 

King RJ, Ozcan SS, Carter T, Kalfoğlu E, Atasoy S, Triantaphyllidis C, Kouvatsi A, Lin AA, Chow CE, Zhivotovsky LA, Michalodimitrakis M, Underhill PA (2008)  Differential Y-chromosome Anatolian influences on the Greek and Cretan Neolithic.  Ann Hum Genet 72:205-214.

 

Rootsi S, Magri C, Kivisild T, Benuzzi G, Help H, Bermisheva M, Kutuev I, Barac L, Pericic M, Balanovsky O, Pshenichnov A, Dion D, Grobei M, Zhivotovsky LA, Battaglia V, Achilli A,  Al-Zahery N, Parik J, King R, Cinnioglu C, Khusnutdinova E, Rudan P, Balanovska E, Scheffrahn W, Simonescu M, Brehm A, Goncalves R, Rosa A, Moisan JP, Chaventre A, Ferak V, Furedi S, Oefner PJ, Shen P, Beckman L, Mikerezi I, Terzic R, Primorac D, Cambon-Thomsen A, Krumina A, Torroni A, Underhill PA, Santachiara-Benerecetti AS, Villems R, Semino O (2004)  Phylogeography of Y-chromosome haplogroup I reveals distinct domains of prehistoric gene flow in Europe.  Am J Hum Genet 75:128-137.

 

Semino O, Magri C, Benuzzi G, Lin A, Al-Zahery N, Battaglia V, Maccioni L, Triantaphyllidis C, Shen P, Oefner PJ, Zhivotovsky LA, King R, Torroni A, Cavalli-Sforza LL, Underhill PA, Santachiara-Benerecetti AS (2004)  Origin, diffusion, and differentiation of Y-Chromosome haplogroups E and J: Inferences on the neolithization of Europe and later migratory events in the Mediterranean area.  Am J Hum Genet 74:1023-1034.

 

Sengupta S, Zhivotovsky LA, King R, Mehdi SQ, Edmonds CA, Chow CT, Lin AA, Mitra M, Sil SK, Ramesh A, Rani MVU, Thakur CM, Cavalli-Sforza LL, Majumder PP, Underhill PA (2006)  Polarity and temporality of high resolution Y-chromosome distribution in India identify both indigenous and exogenous expansions and reveal minor genetic influence of central Asian pastoralists.  Am J Hum Genet 78:202-221.