MacGregor DNA Project blog update 2018
This year’s blog will be slightly different from previous ones, in that
I want to spend part of it explaining how genetic genealogy helps us
understand intra-clan relationships. Before I can do that, I need to say
exactly what a Scottish clan is, as there are some particular
misconceptions about the nature of a clan in the 21st
century. At the end, I go on to discuss
how SNPs are helping to build up a better picture of family groupings within
the main line MacGregor family tree after the mid 14th century.
What is a clan in
Scottish usage?
Six hundred years ago this question was quite simple to
answer. You were associated with a clan if you had been born with the name – in
the MacGregors’ case that might be expressed as Gregor, Grigor, MacGregor,
McGrigor, McGregor and a whole range of alternative spellings such as, for
example, McGreagor (an attempt to render the Gaelic phonetically into English?).
At that time too spelling had not been standardised - so one might find
Mckgregor, M’gregor and so on. You were also a member of the clan if your name
was an accepted variant, such as Grierson, or Grier, Greig/Grieg/Grig etc.
These were considered to be shortened or anglicised versions of the main clan
name. So, Grier-son equals Gregor-son and Grier is the same name without the
‘son’ on the end.
Whether or not these accepted names
were genetically related to the main line was not the point, as a clan was a
collection of related surnames. Members of the clan recognised as Chief the
head of the main line (the Chief of the MacGregors for example) and often,
especially in the early days, relied on him for protection or, rather, on his
ability to pull a ‘federation’ of individuals together, either to provide
(usually armed) protection or to seek retribution on another group for
some offence.
There were others associated with
individual clans: people who belonged to individual septs. Sometimes the same
name would appear in two or more lists of accepted septs of different clans –
such is the case with surname King, among others. Again, some descriptive words
used as surnames were understood to have been borne by people associated with
the clan, and these surnames are found in many clan lists: Bain (or Ban) or its
anglicised equivalent White; Roy meaning Red; Dhu or Dow meaning dark or black,
are some examples.
Finally, there were people who
answered none of these ‘qualifications’ but who lived on the land which was
under the Chief’s influence as ‘part-takers’. Grant and Menzies rental
documents of the 18th century reveal instances where individuals
adopted the name of the local chief where formerly they were called only by
their patronymics. A patronymic shows the genealogy of an individual back two
generations – so my patronymic would be Richard McEwan VicPeter: there McEwan
is not a surname but shows that my father was Ewan (not THAT Ewan). On a rental
document you might find John McGregor VicPatrick which means that John’s father
was Gregor and his grandfather Patrick: if John were a very poor inhabitant – a
cottar – it might be that he had lost the knowledge that he was a genetic
MacGregor and so ended up taking the surname Grant. It’s unusual to find this
situation among MacGregors because of their turbulent history but it happens in
other clans. Paradoxically because the MacGregor name was proscribed
[forbidden] for so long [1603-60 and 1693 to 1774] it seems many families held
on to the knowledge that they were MacGregors despite having been forced to
take other surnames. Some families never changed back to MacGregor when it was
finally possible to do so – which is why in the DNA project we see individuals
called Drummond, Stirling, Campbell etc who are genetically MacGregors - their
ancestors never readopted the name when it was safe to do so.
What is clan in the
21st century?
To some the concept of clan in the present century is an
irrelevance. We no longer need the Chief to protect us and we no longer live in
defined communities of inter-related individuals offering support to each other
in the same way. In the 21st century a clan is made up of people who
value the bonds of kinship often promoted by Clan Societies, and they value
the, sometimes surprising, connections that clan association brings. At its
gathering in 2014 the Clan Gregor Society of Scotland had 11 nationalities
represented, some of whom like the Philippine and South African contingents
actually shared a recent common ancestor. Members of a clan still recognise the
Chief as head of the clan, but, as I have explained, it is likely that
relatively few share a common ancestor with him (or her). Most Clan Societies
recognise that finding paper evidence of relationships becomes increasingly
difficult before 1750 and therefore accept either a female line connection (e.g. ‘great
grandmother was a MacGregor’, or a King, or a Bain etc) or a
tradition within a family of MacGregor clan connection.
What does genetic
genealogy tell us?
1) That people called by a certain surname are not necessarily descended from a common
ancestor, although as we have found in the DNA project approximately 50% within
a surname subgroup do share a common ancestor. (I will explain my subgrouping
in a moment)
2) That there are different groups of individuals called, say, MacGregor, whose
members share a common ancestor in the fairly recent past, but the connection of
one such group to another is likely to lie thousands of years in the past,
before surnames
3) That there are multiple origins for surname groups
4) That clan sept names have varied genetic origins and not just one single origin
within the period of the existence of surnames (surnames become more common after
the mid 14th century, particularly in England following the Black
Death as there was more movement of labour to replace those lost to the plague)
The MacGregor DNA project at this date has 1427 members. In
order to make the results easier to navigate I have divided them up into
subgroups. The advantage to this is that within a subgroup it is easier to see
individuals who are potentially related, in the more recent past, to each other,
because they appear in the surname grid near to each other. It also shows up
where there are multiple origins for surnames, and is particularly useful where
some members of the subgroup are able to indicate an earliest known ancestor.
In a good number of cases that ancestor will be shared with other members of
the subgroup and some of the group may not have the genealogical information
that others have and so find their link to the past. It can also suggest
geographical location for the ancestor. The disadvantage is that members of
different subgroups in the results grid cannot quickly compare their results with
members of other subgroups, but in a Y chromosome study that is less of a problem.
In any case everyone has access to a ‘Matches’ tab on their DNA results page
and also everyone can use Ysearch and Mitosearch.org.
To further illustrate what can be
done I will demonstrate with one subgroup. Regular readers of this blog know
that I have offered to make any comparisons between individuals that are
wanted. It would be helpful if this could normally be limited to 10-12 results
for comparison. Also, please remember that I have to be able to compare like
for like – so I cannot compare a 37 result with a 67 – I can only use the first
37 of the latter for the comparison.
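The like-for-like rule can be sketched in a few lines of Python. This is an illustration only: the function name `genetic_distance`, the marker values and the simple step-counting rule are my own assumptions for the example, not the method used by any particular testing company or utility.

```python
from typing import Sequence


def genetic_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute repeat-count differences over the markers both kits share.

    Only the first min(len(a), len(b)) markers are compared, so a 67 marker
    result is cut back to its first 37 when set against a 37 marker result.
    Assumes both lists report the markers in the same order.
    """
    shared = min(len(a), len(b))
    return sum(abs(x - y) for x, y in zip(a[:shared], b[:shared]))


# Hypothetical repeat counts, shortened to 5 and 8 markers for readability.
kit_short = [13, 24, 14, 11, 11]
kit_long = [13, 25, 14, 11, 12, 12, 12, 13]

# Only the first 5 markers of kit_long take part in the comparison.
print(genetic_distance(kit_short, kit_long))
```

The extra markers on the longer result are simply ignored, which is why a 67 marker kit adds nothing to a comparison with a 37 marker kit.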
Example 1: Greig/Grieg/Gregg etc surname 37 marker result
grid
The first example shows all the 37 marker results for this
subgroup. No colours are present in this grid to show mutations because we are
not trying to compare individuals to one ‘master’ result or even to an
‘average’ or modal result. It can be hard without colours to see how one
individual is related to another but in general if results are closer together on
the grid they would tend to be more closely related genetically (but just being
close on the grid does not necessarily
pick up results that are more related than others). That is why we use a
graphics program to make comparing results more visual in what is known as a phylogenetic
tree. Also the program used picks up similarities across results that might not
be immediately obvious to the eye.
Example 2: Greig/Grieg/Gregg etc surname 37 marker
phylogenetic tree
The second example therefore shows the chart in graphic form,
generated by Splitstree (acknowledgement is given at the bottom of this blog).
I have labelled them as the chart is labelled so perhaps the easiest way to
make comparisons is to print off both examples and cross-refer between them. In
this graphic representation some relationships become more obvious but there also
are some surprises. The first, and most important, point to make is that we are
seeing at least 11 distinct family groups who each shared a common ancestor many
thousands of years ago long before surnames became common. So we see at least
11 different genetic origins for people called Greig, Grigg, Gragg and Gregor.
In general, if results are closely clustered together on the graphic then they
probably share a recent common ancestor (and by recent I mean since the
acquisition of surnames).
1) There are 7 individuals (kit 20673 is one) who share a common ancestor who could be
the William Gregg born in 1616 or his immediate forebears - an early emigrant
to the New World. This family have no spelling variations – always Gregg – so
may have been literate from the earliest emigrant
2) There is a group of 3 (kit 214992 is one), who are related and may have a connection
to Tipperary in Ireland: these families are Gregg or Gragg
3) There is another group of 3 (kit 239449 is one) - these connect to a common ancestor
but there is no indication of who this might be in the genealogies submitted –
these used spellings Grieg, Greig and Gregg
4) Another group of 4 (kit 158127 is one) who seem to connect to Antrim in Ireland (using
Gragg and Gregg)
5) There is a group of 3 (45360 is one) who connect to Edinburgh and Pathhead (a nearby
village) (surnames are Greig and Gregg)
6) There is a group of individuals who are genetically related and with geographical
links to North East Scotland with very different versions of the name: Griggs,
Greig and Grigor (9690 is one of these)
7) A group of 2 (kit 6979 is one) using Gregg (no locations available)
All other individuals appear to belong to separate unrelated
families although the distant possible connection between Gregor(y) 476609 and
Charles Greig 585177 would be worth further investigation. Robert Gregor 239031
belongs to a completely different genetic haplogroup.
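For readers who like to see the logic spelled out, the idea of picking family groups out of a set of results can be sketched as follows. This is a toy illustration under stated assumptions, not how Splitstree works: the kit labels and marker values are invented, the distance is a simple count of marker steps, and the cut-off `max_distance` is an arbitrary choice. Two kits are linked if their distance is within the cut-off, and a family group is any set of kits connected by such links.

```python
from itertools import combinations


def distance(a, b):
    """Count of marker steps between two results of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cluster(kits, max_distance):
    """Group kit labels into clusters linked by distances <= max_distance."""
    parent = {label: label for label in kits}

    def find(label):
        # Follow parent links up to the representative of the group.
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    for a, b in combinations(kits, 2):
        if distance(kits[a], kits[b]) <= max_distance:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for label in kits:
        groups.setdefault(find(label), []).append(label)
    return sorted(sorted(members) for members in groups.values())


# Invented four-marker results for five hypothetical kits.
toy_kits = {
    "A": [13, 24, 14, 11],
    "B": [13, 24, 14, 12],
    "C": [13, 25, 14, 12],
    "D": [12, 22, 15, 10],
    "E": [12, 22, 15, 10],
}

print(cluster(toy_kits, max_distance=1))
```

Note that A and C end up in the same group even though they differ by two steps, because each is one step from B. That chaining effect is one reason a proper phylogenetic program gives a better picture than a simple cut-off.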
Example 3: Greig/Grieg/Gregg etc surname 67 marker grid
This example [Examples 3 and 4] shows the smaller group of
individuals who have tested 67 markers (they are all in the 37 group discussed
above). What we are interested in is whether the greater number of markers
gives any further information on genetic connections. Since 50% of the sample
shown in Examples 1 and 2 is now not present, the graphic representation is much
sparser.
Example 4: Greig/Grieg/Gregg etc surname 67 marker
phylogenetic tree
The problem that is immediately apparent [in Example 4] is
that without a larger number of individuals testing to 67 (or more) markers
family groups do not break down significantly further. In Example 4 the only possible confirmation
seen is that Kits 7489 Gregg, 214992 Gragg and 9690 Greig may share a common
ancestor in the relatively distant past but it is possible that all three share
a geographical origin – North East
Scotland as suggested in point 6 above.
***
By way of comparison I have used the same processes on the
Gregory group who also have diverse origins, but what is particularly
interesting with this group - given that membership of a DNA surname testing
project is essentially based on random participation - is to see just how many
individuals descend from the same ancestor: given that there
are forebears in common (for example Gideon Gregory – kits 58711 and 179683),
this group very probably had a common emigrant ancestor in the United States.
Example 5: Gregory surname 37 marker phylogenetic tree
Apart from that group of related individuals there are only
4 other, much smaller, groups whose members’ ancestry lies close to each
other. The ‘earliest known ancestor’ given by each participant suggests a range of possible genetic origins
(see the grid in Example 6).
Example 6: Gregory surname 37 marker grid
Dean McGee’s DNA Utility allows an estimate of time to most recent common ancestor.
Example 7: Gregory surname 37 marker Time to Most Recent
Common Ancestor grid (partial)
This
is only an estimate and the number of years suggested always depends on the
confidence level chosen for the program – choosing 100% confidence would give a
different result from choosing 10% confidence. In this example from the Gregory
charts we see an estimate of the possible time to the shared ancestor for each
individual in the group with each other person. In order to see a good number
of results I have had to remove the labels from the top grid but they can
simply be put in by hand - going from left to right on the grid top line in the
same order as reading top to bottom. Notice, for example, that comparing the
first two individuals ‘Peter R. Gregory’ 275887 and Gregory 37140 suggests that
they share a common ancestor 5220 years ago.
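To show why the confidence level matters so much, here is a deliberately simplified sketch of a time-to-common-ancestor estimate. It is not the calculation used by the DNA Utility: it assumes every marker mutates at the same rate, that any mutation on either line makes a marker mismatch, and that every number of generations is equally likely beforehand. The rate of 0.004 per marker per generation is a placeholder I have chosen for illustration, and the function name is my own; only the figure of 30 years per generation comes from the settings noted at the end of this blog.

```python
import math

YEARS_PER_GENERATION = 30  # the generation length used for the charts


def tmrca_generations(n_markers, mismatches, confidence,
                      rate=0.004, max_gen=2000):
    """Generations back by which the common ancestor lived, at `confidence`.

    Toy model: after g generations on each of two lines a marker still
    matches with probability exp(-2 * rate * g). The weight of each g is the
    chance of seeing `mismatches` mismatches out of `n_markers`; the weights
    are accumulated until they reach the requested share of the total.
    """
    weights = []
    for g in range(1, max_gen + 1):
        match = math.exp(-2 * rate * g)
        weights.append((1 - match) ** mismatches
                       * match ** (n_markers - mismatches))
    total = sum(weights)
    running = 0.0
    for g, weight in enumerate(weights, start=1):
        running += weight
        if running / total >= confidence:
            return g
    return max_gen


# Two hypothetical 37 marker results that differ on 4 markers.
for level in (0.10, 0.50, 0.75, 0.95):
    generations = tmrca_generations(37, 4, level)
    print(f"{level:.0%} confidence: about "
          f"{generations * YEARS_PER_GENERATION} years")
```

The same pair of results gives a longer and longer estimate as the confidence level is raised, which is why any figure in years should always be quoted together with the confidence level behind it.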
All of these analyses
benefit greatly – and benefit other genealogists – if testers indicate the name
of their earliest known male ancestor with the surname – no matter how recent
that might be.
If we now look at the phylogenetic tree created when we use
only those kits that have tested to 67 marker level the only real difference is
that some of the genetic distances seem to be clearer.
Example 8: Gregory DNA 67 marker grid
However, this particular program only
separates by mutation – so if we look at the Gideon Gregory results again it looks
like they come from different lines of the same family, rather than from the
common ancestor Gideon. This is a limitation not of the program’s analysis but of the
inability to tell the program that two results are known to come from the same ancestor.
We have to remember that programs such as this were not designed primarily for
family history but for comparing genetic markers in species and not just the
human species. After all, it would be next to impossible to say whether two
turtles shared the same great great grandfather …
The development of
SNP analysis
For several years now there has been an increasing focus on
the testing and analysis of SNPs (single nucleotide polymorphisms). The
difference between these and the more commonly tested STRs was given in my 2014
blog (opening paragraphs). To put it simply, SNPs are markers in time: as far
as is known if these mutate they stay mutated in subsequent generations. What
that means is that once enough SNPs are identified an element of dating can be
applied to when the mutation happened. For family historians this fact is becoming
hugely important. Dating SNPs to the time before surnames is of limited use to
family historians but to have dates, even approximate, from the time after the
adoption of surnames means that family surname groups can be split down into
smaller and more recent family subgroups.
On the Greig and Gregor grids I show the SNP information which is
assigned to each individual in the leftmost column. In most cases this is
simply M269, a SNP that happened thousands of years ago. Some individuals have
had some SNP testing done but very few people have had the ‘Big Y’ test done
which takes results forward in time towards the present day, and identifies SNPs
which may have happened between 500 and 800 years ago.
Greig SNPs
In the Greig grid confirmed SNPs are in green. M269 is too
early in date to be considered so the only other SNPs to be taken into account are
kit 476609 R-L1066; 363402 R-FGC10125; 259416 R-U152; 585177 R-Z253; 195430
R-U106; 295321 R-FGC5494; 404866 R-FGC37100; 110496 R-L21; B196295 R-ZP77; and
239031 T-M70
Of these U152, R-Z253, R-U106, R-L21 are well known early
SNPs which happened before surnames, sometimes by thousands of years and most
have further testing options available to bring the results further forward in
time. FGC in the results indicates that the SNP was identified by the Full
Genome Corp (as indeed the other letters identify the source lab or individual
who identified the SNP in question in the first place). Of the other SNPs:
L1066 is more
recent but still before surnames.
FGC10125 is the next SNP in the sequence for some individuals after L1065 [not the same as L1066] which is
said to identify the Scots modal group
[see http://www.ytree.net/DisplayTree.php?blockID=160]
Since L1065 roughly dates to 1750 years before the present, FGC10125 may have happened before
surnames.
FGC5494 is
European in origin but again is somewhat earlier than surnames, and has SNPs
which descend from it towards the present time
FGC37100 is a
descendant, or technically, ‘downstream’ of L151 – that SNP is again a much earlier
one and found in England as well as other places.
ZP77 is the same
as FGC6562 and is found in
concentrations in Ireland and to a lesser degree in Scotland: it also has
numerous downstream markers
Finally T-M70 is
a very early SNP with a distribution over southern Europe, the Middle East and
East Africa. It is comparatively rare among tested individuals [see www.yfull.com/branch-info/T-M70/
which dates it to 16,000 years before the present].
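The idea of one SNP being ‘downstream’ of another can be pictured as a chain of links, each SNP pointing back to an earlier one. The sketch below is an illustration only: it records just the two relationships mentioned above (FGC37100 downstream of L151, and FGC10125 coming after L1065), leaves out every intermediate SNP, and the names `UPSTREAM`, `lineage` and `is_downstream` are my own inventions rather than part of any published tool.

```python
# Each SNP points to an earlier SNP it sits below. Only the two links
# described in the text are recorded; intermediate SNPs are omitted.
UPSTREAM = {
    "FGC37100": "L151",
    "FGC10125": "L1065",
}


def lineage(snp, upstream=UPSTREAM):
    """Return the chain from `snp` back through every recorded earlier SNP."""
    chain = [snp]
    while chain[-1] in upstream:
        earlier = upstream[chain[-1]]
        if earlier in chain:  # guard against a circular entry
            break
        chain.append(earlier)
    return chain


def is_downstream(snp, ancestor, upstream=UPSTREAM):
    """True if `ancestor` appears above `snp` in the recorded chain."""
    return ancestor in lineage(snp, upstream)[1:]


print(lineage("FGC37100"))
print(is_downstream("FGC37100", "L151"))
print(is_downstream("L151", "FGC37100"))
```

As more BigY results come in, a table like this simply gains more entries, and two testers can be compared by seeing where their chains first meet.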
Similar discussion on identified SNPs could be done for all
the subgroups in the MacGregor DNA project. The Gregory group, for example, has
the following identified SNPs (see example 6):
R-S16906; DF21; R-CTS7678; L48; Z343; R-P312; R-L1336;
R-BY15955; R-S691
MacGregor DNA –
current SNP analysis
The work which Neil McGregor in Australia has been doing in
analysing MacGregor SNPs is not concerned with the earlier SNPs. We already
knew from Jim Wilson’s work that most MacGregors in the main line group carried
the SNPs S690+ and S697+, both probably dating from after 1200AD (though no
absolute dating is yet available). In his analyses Neil has begun to break down
the test results of MacGregor participants into individual family groups –
which the clan has known about for generations and which are referred to in
older documents as the ‘houses’ or ‘sleik’ [of] Clan Gregor, the main ‘houses’
being Glenstrae, Roro, Gregor McIan (or Brackley), Dugall Keir and more recently
‘of Glencarnock’ the Chief’s line (Glencarnock is the area they held from the
mid 18th century).
Neil’s current identification of family groups is given in
Example 9:
Example 9: current predictions of SNPs associated with
MacGregor family groups.
Neil’s email to me allowing this to be included in the blog suggests
that MacGregors from the main line should seriously now consider doing the BigY
DNA test (rather than FGC – Full Genome Corp – with whom we have also
undertaken testing). He says:
“The best recommendation is that
people get BigY as everybody seems to have between 3 and 8 separate SNPs which
will allow them to be separated from everybody [else], other than from their
own immediate family or first cousins. Some of them [those who have tested
under BigY] appear to have a cluster of SNPs which appear to have mutated
together and may represent one mutation. A mutation seems to be as low as once
per generation through to once every 4-5 generations – seems related to the
number of STR [the marker scores that participants start with] mutations as
well.
The clan seems to be divided into
two major clusters and this would appear to be early on. The section I am in
has at least 3-5 sub-branches as does the other major group. The dividing SNP
appears to be BY28714”.
Just to repeat: I can do
comparisons of STR results for individuals, comparing with up to 10 to 12
others. I would repeat Neil’s encouragement to do BigY if you can – please ask
me for further information if needed [richardmcgregor1ATyahoo.co.uk
substituting @ for AT]
I have just had a comment from EMC which is worth repeating here in case folks miss it:
It is important to note that the results Neil has recently shown are also due to FTDNA reprocessing BigY kits under a new genome reference called HG38. Prior, under HG19, many of the SNP's used now were heretofore unknown. This SNP refinement is important.
Charts were
constructed using Dean McGee’s Utility at http://www.mymcgee.com/tools/yutility.html?mode=ftdna_mode, using a 75% level of confidence, on
Doug MacDonald’s mutation rate, an average of 30 years per generation and with
no modal results assigned. The graphic representations of phylogenetic trees are
made by Splitstree:
D. H. Huson and D. Bryant, Application of Phylogenetic Networks in Evolutionary Studies, Mol. Biol. Evol., 23(2):254-267, 2006