01 January 2017

MacGregor DNA Project January 2017 update

I used the 2015 update of this blog to explore the various tests available. Since then there has been a noticeable increase in the number of individuals taking the test known as ‘Family Finder’, or something similar, rather than Y chromosome or mtDNA tests. This is probably the result of quite aggressive marketing, by Ancestry.com in particular [it has a variety of website endings depending on where it is based]. The marketing has promoted the equal use of DNA testing for both males and females and tied it to the submission of family trees, which individual testers can use to identify family names shared with others who have tested and submitted their genealogies. What has perhaps been rather glossed over in this is, firstly, the fact that DNA gets ‘lost’ over time – if it didn’t we would have the DNA of billions of ancestors in our bodies – and, secondly, that it is only a tiny portion of our DNA which is currently being examined for genealogical purposes. In relation to the latter, if you took all the DNA out of your approximately 100 trillion cells and stretched it out in a long line, with DNA in one cell being about 2 metres long, it would reach to the moon and back about 8000 times. By comparison, if you did the same with your veins, arteries and capillaries they would measure about 100,000 kilometres (or 62,000+ miles), or roughly two and a half times the circumference of the earth.
     You do not inherit exactly 25% of your autosomal DNA from each of your 4 grandparents. Although you receive half of your autosomal DNA from each parent, the half that each parent passes on is a randomly recombined mixture of what they received from their own parents, and not in equal proportions. The further you go back in time, the smaller the percentage inherited from any one person in a particular generation becomes, and therefore the more distant the ancestor, the more difficult it is to identify what you received from that person. What then are the chances of that same bit of DNA being preserved from a specific ancestor in yourself and someone else? For example, if your name is, say, Smith, and your male MacGregor ancestor lived 10 generations ago on your mother’s side, it is really not feasible with today’s technology to identify that DNA by looking at your DNA today. The tests which are offered by Ancestry, Family Tree DNA etc. only try to identify links to 5/6 generations back. The key thing to remember is that if you and someone else have, say, people called Brown in your trees it does not necessarily mean that you have a recent ancestor in common, or indeed that you have any ancestor called Brown in common at all. For these tests of connection to work properly, you, and the person you are comparing with, need to have as much genealogical information as possible on every ancestral line in your respective trees, going back 5 or 6 generations (as well as a significant shared portion of DNA). This information can be displayed on a fan chart such as this one, which can be viewed at:
Fig 1. fan chart example - Perry
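
The dilution described above can be illustrated with a little arithmetic. The following Python sketch deals in averages only: it assumes a simple doubling of ancestors in each generation (no cousin marriages in the tree), and the function names are invented for the example. Actual shares scatter around these averages because of recombination.

```python
# Illustrative averages only; real shares vary because of recombination.

def ancestors_in_generation(generations_back):
    """Number of ancestor slots n generations back (2 parents, 4 grandparents, ...)."""
    return 2 ** generations_back

def average_share(generations_back):
    """Average fraction of autosomal DNA from one ancestor in that generation."""
    return 1 / ancestors_in_generation(generations_back)

for n in (2, 5, 6, 10):
    print(f"{n:>2} generations back: {ancestors_in_generation(n):>4} ancestors, "
          f"average share {average_share(n):.3%}")
```

At 10 generations back there are 1024 ancestor slots, so the average share from any one of them is under 0.1%, which is why a MacGregor ancestor at that distance is so hard to detect autosomally.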

You and the person you are comparing with both have to put all your family information into one of the computer genealogy programs (like Family Tree Maker, Reunion etc.), save it as a GEDCOM file, and upload it to whichever DNA company you have tested autosomal DNA with. The possible links between the two family trees are then highlighted in some way which allows a comparison of ancestry to be made in order to explore if there is a family match on some surname. The fact that there is a match on surname does not necessarily mean that it is the same family, only that there is a surname in common. Clearly the more unusual the surname, the more likely that the match will be with the same family. In my own case the extensive GEDCOM file that I have made found a name match in another equally detailed GEDCOM tree which linked to a common ancestor born towards the end of the 18th century in the Volga German colonies in Russia, although in fact this was just confirmation of a previously suspected connection found by traditional genealogical research (I’ll come back to this later). Clearly autosomal testing COULD be very valuable, despite these caveats, for females who cannot take advantage of the Y chromosome test which only men can do since it relates directly to father’s surname (unless of course a father, brother, or cousin who has the name of interest will do the test on the female’s behalf).
     There is another benefit to autosomal testing in that portions of a person’s DNA can be compared with typical results from other ethnic groups. So, for example, a person might find through this utility (sometimes called My Origins or similar) that they have Mediterranean, African, or Native American ancestry as part of their genetic makeup. This won’t be a link to a specific person (unless of course this genealogy is already known or at least suspected) but is often of interest when an individual wants to know the answer to the question ‘where did I come from?’. This is explored a bit more later in this blog. What we invariably find is that our most ancient ancestry is incredibly mixed and that our own personal ethnic makeup links back to many different regions and races. The Economist, discussing the work of scientist and geneticist Luigi Luca Cavalli-Sforza, remarked that his work "challenges the assumption that there are significant genetic differences between human races, and indeed, the idea that 'race' has any useful biological meaning at all".

SNPs
     I want to spend most of the rest of this update considering the current and possible future significance of SNP testing. An SNP, or single nucleotide polymorphism, is a DNA ‘event’ which functions a bit like a marker in time. At the moment, I am here talking about SNPs which occur in the male line (and are often relatable to surnames). As far as is known, once an SNP occurs it does not re-mutate backwards. Because of this fact, if we can identify specific SNPs, as a first step, and if we can then date them, even approximately, we find ourselves on the way to constructing family trees, not just in prehistory but, potentially, for SNP events which took place within historic time. More than a decade ago Ken Nordtvedt and Dr John McEwan worked on ways of grouping STR results [that is, results from the standard male Y chromosome test] to show how some numerical patterns were constant within defined groups. This was limited at the time by the fact that only 37 markers were available in STR tests (whereas now one can test 67, 111 or even more). McEwan identified 49 groupings, and the one he called R1bSTR-47 came to be known as ‘Scots’. This was the DNA profile based on 37 markers that was identified as ‘Scots’ (Fig. 2):



 Fig 2.  The ‘Scots Modal’ and ‘MacGregor Modal’ compared
The lower figure was the modal figure for the MacGregor group who claimed descent from Ian Cam (who is in the record of obituaries written down by Sir James MacGregor Dean of Lismore as having died in 1390). What was exciting at the time was the realisation that with only three mutations different, the MacGregors seemed to be a group who had mutated a little way, but not far, from the Scots group. Since then some commentators and researchers, particularly Alistair Moffat and Jim Wilson in their book The Scots: A Genetic Journey have suggested that the MacGregors were actually Picts. Without wanting to go into the arguments for or against this interpretation in any depth I did want to ask: if these were the Picts then who are the Scots [this is an inversion of the way the question used to be asked]? If you look at the various DNA discussion boards you will see that many people disagree with their interpretation. It is nonetheless important to repeat it here because the definition of Pict was based, by Jim Wilson, on his interpretation of the SNP S530 which he found and named when he was attached to ScotlandsDNA [the company now has various versions of the same name such as IrelandsDNA, BritainsDNA and so on]. 
     Following on from this, S530 was found to be equivalent to L1335 [the confirmation came in 2012] and the search was on to find what SNPs were more recent in time than L1335/S530.  Four years ago not enough was known about SNPs to attempt a time estimate as to when they occurred, but since then the group known as YFull have given broad estimates, from their research, of possible dates for each SNP. At this point I am concentrating on just the MacGregor male line but later in this blog I will make reference to another line to demonstrate how SNP testing is currently revolutionising our understanding of genetic/clan genealogy going back into the past beyond the time of written genealogies. What we still lack are more SNPs coming forward into the time of written records although there are some SNPs which are beginning to divide up families into smaller subgroups with the same surname, particularly when associated with STR results.
     At this point I repeat a chart which I first included in the 2015 blog. The chart was derived by Jim Wilson from work which he did in relation to clan DNA origins through the ScotlandsDNA company (Fig. 3).


Fig. 3: The SNP tree of the Scottish clans as at 2014

This helpful map from Wikipedia shows how the clans relate to each other geographically (Fig. 4):



Fig 4.  Scottish_clan_map Wikipedia Commons.png

Taking each of the SNPs from S530/L1335 onwards, YFull have given the following time estimates:

L1335 (also known as [aka] S530) formed 4300ybp (years before present), time to most recent common ancestor 3600ybp
L1065 (aka CTS11722 or S749) formed 3600ybp, time to most recent common ancestor 1750ybp
S744 formed 1750ybp, time to most recent common ancestor 1750ybp
S691 formed 1750ybp, time to most recent common ancestor 1700ybp
S695 formed 1700ybp, time to most recent common ancestor 1550ybp [or, c.320 AD to c. 470 AD]
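
To make the ‘years before present’ figures easier to read as calendar dates, the list above can be converted with a short Python sketch. It assumes that ‘present’ means 2017, the year of this update, which is an assumption and not something YFull state here; the names PRESENT_YEAR, ybp_to_year and label are invented for the example.

```python
PRESENT_YEAR = 2017  # assumption: "present" taken as the year of this update

# (SNP, formed ybp, time to most recent common ancestor ybp), as listed above
YFULL_ESTIMATES = [
    ("L1335", 4300, 3600),
    ("L1065", 3600, 1750),
    ("S744", 1750, 1750),
    ("S691", 1750, 1700),
    ("S695", 1700, 1550),
]

def ybp_to_year(ybp, present=PRESENT_YEAR):
    """Convert years-before-present to a year number (zero or negative means BC)."""
    return present - ybp

def label(year):
    """Format a year number as AD or BC, allowing for there being no year zero."""
    return f"{year} AD" if year > 0 else f"{1 - year} BC"

for snp, formed, tmrca in YFULL_ESTIMATES:
    gap = formed - tmrca  # years between the SNP forming and the common ancestor
    print(f"{snp}: formed c. {label(ybp_to_year(formed))}, "
          f"common ancestor c. {label(ybp_to_year(tmrca))}, gap {gap} years")
```

The gap column shows how tightly the later SNPs are bunched: 0, 50 and 150 years for S744, S691 and S695, against 700 and 1850 years for the two older ones.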

It is not clear from the above how YFull distinguish the time frames of the later SNPs, as they seem to be suggesting that they all arose around the same time. Other commentators on DNA have speculated that there are approximately 90 years on average between SNP mutations, but that assumes a degree of regularity in the occurrence of mutations, which we know is not the case.
     If we work in reverse from S690 [see Fig. 3 above] which is the defining mutation of the MacGregor bloodline, and which, we think, could have arisen as late as 1360 AD and perhaps as early as 1200 AD, then, using the standard SNP sequence going backwards in time, S695 might have occurred as late as 1270 AD and as early as 1110 AD.  Then S691 above S695 as late as 1180 AD or as early as 1020 AD. Ian Cam MacGregor descendants will notice that there is no mention of S697 here. We simply don’t know enough about this SNP or FGC17830 which seem to be typical of the MacGregor main line to say anything definite about them at this point. Indeed, we know little about S701, FGC17831, S703, FGC17832, S27834, BY144, FGC17829, S27835, and BY143 which are currently found as positive ONLY in the SNP results of MacGregor bloodline and in two Buchanan men. (By the way, it is very interesting that almost all other Buchanan men descend from the next level up on the ancestral tree as shown by the results collected and displayed on the excellent site run by Alex Williamson). The section for descendants of SNP L1335 can be found on Williamson’s site at: 

Fig 5. MacGregor/Buchanan section of Alex Williamson’s The Big Tree
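
The working backwards from S690 described above can be written out as a short Python sketch. It simply subtracts the speculative average of 90 years per SNP for each step up the tree, which, as already noted, assumes a regularity that mutations do not actually have. The function name step_back and the constants are invented for the example.

```python
YEARS_PER_SNP = 90  # speculative average spacing quoted above, not a measured rate

def step_back(window, steps, years_per_snp=YEARS_PER_SNP):
    """Shift an (earliest, latest) window of AD years back by a number of SNP steps."""
    earliest, latest = window
    shift = steps * years_per_snp
    return (earliest - shift, latest - shift)

S690_WINDOW = (1200, 1360)  # suggested window for S690 given in the text

# Order as used in the text: S690, then S695 above it, then S691 above that
for steps, snp in enumerate(["S690", "S695", "S691"]):
    earliest, latest = step_back(S690_WINDOW, steps)
    print(f"{snp}: as early as {earliest} AD, as late as {latest} AD")
```

Changing YEARS_PER_SNP shows how sensitive these windows are to the assumed spacing.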

If YFull are right in their estimates then it is quite possible that these SNPs [from S701 onwards] fill the gap between c. 500 AD and c. 1300 AD, implying that MacGregors remained a homogenous group from 500 AD onwards BUT this seems highly unlikely, to say the least. Looking at Williamson’s trees it is quite clear that there are many SNPs about which we need more information. Williamson shows the following as the sequence of SNPs from L1335 (and this same sequence is shown as part of the Scots Modal Panel which can be ordered from YSEQ):

L1335/S530 > L1065 > Z16325 > S744 > S691 > S695 > S701 > S690
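
Because an SNP, once it has occurred, is carried by all male-line descendants, a sequence like this can be treated as an ordered list running from oldest to most recent. The Python sketch below shows, under that assumption, how a tester's most recent (terminal) SNP on this path and the branch point shared by two testers could be read off. The two testers and their results are hypothetical, and the function names are invented for the example.

```python
# Sequence as given above, oldest first
PATH = ["L1335", "L1065", "Z16325", "S744", "S691", "S695", "S701", "S690"]

def terminal_snp(positives):
    """Most recent SNP on PATH for which the tester is positive, or None."""
    terminal = None
    for snp in PATH:
        if snp in positives:
            terminal = snp
    return terminal

def most_recent_shared(positives_a, positives_b):
    """Most recent SNP on PATH for which both testers are positive, or None."""
    return terminal_snp(set(positives_a) & set(positives_b))

# Hypothetical testers: A is positive all the way to S690, B only as far as S691
tester_a = set(PATH)
tester_b = {"L1335", "L1065", "Z16325", "S744", "S691"}

print("Tester A terminal SNP:", terminal_snp(tester_a))
print("Tester B terminal SNP:", terminal_snp(tester_b))
print("Shared branch point:", most_recent_shared(tester_a, tester_b))
```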



Fig 6: Section of the L691 descendant SNPs from YSEQ

This image, which is taken from the YSEQ website, shows just how complex the clans’ family tree is becoming, and how it changes as we learn more and more about SNPs. It is clear that the relationships which exist between Scottish clan groups will eventually be refined in such a way that the traditional stories of clan origins will either be confirmed or refuted. So, even though we cannot now say with absolute confidence what the exact sequence of SNPs leading from L1335 to S690 is, and cannot assign secure dates for their first appearance, it will only be a matter of time before the confidence levels on all this become much clearer. It is remarkable how much has been achieved in just four years.
     For comparison, I attach another small section from Alex Williamson’s Big Tree – in this case for the McLeans (Fig. 7). We can see in this that the family groups are beginning to divide up based on individual SNPs separating family groups. The McLeans, like the MacGregors, have a small number of individuals who are quite separate from the mainline, in having the SNP M222 which often has Irish connections, or other SNPs which seem to indicate adoption of the name by different families during the time of change from patronymics to surnames.


Fig 7: McLean section from Williamson The Big Tree

What are patronymics?
     The use of patronymics versus surnames is not always well understood, so a brief discussion here might be helpful. Years before surnames became common in Scotland an individual might be referred to by his geographical location – so in Clan Gregor we have John of Glenorchy who flourished in the 13th century. It has always been believed that this John was one of the first MacGregors, but without a surname we have no real way of knowing if he was indeed a MacGregor – there’s always the possibility that he was an early Campbell. After the middle of the 14th century, and in Britain following the devastation wreaked by the Black Death, not only did individuals move around rather more than before but individuals were less tied by servitude to landowners. In Scotland individuals who managed to acquire some land-holding capability began to use surnames to identify family groupings, whereas, the common people around them would continue to be identified by their family relationships in the formula ‘the son of and grandson of’ (or ‘daughter of’ in the case of females) another male forename. Thus, whereas the most important MacGregor might be known as Patrick MacGregor of Glenstrae, the under tenants, who might, or might not, be distantly related to him, would be known as, for example, Patrick McEwan VicConachie [son of Ewan, grandson of Duncan]. As late as the middle of the 18th century the rental documents of the Menzies estates were still referring to males by their patronymics, but, by the end of the century all tenants had acquired and been identified by fixed surnames. These families might indeed be genetically MacGregors but they might equally be genetically Menzies or Drummond, or they might never have been genetically linked to one of the main families in that area at all. It is because of this variety of means by which names were acquired that clans include individuals who have a variety of genetic origins.

DNA testing companies and their products.
Y chromosome tests:
     It can be very difficult for people new to DNA testing to work out what the best test is for them. The answer depends on what question a person wants answered. If a male wants to know about his surname and its origins, then the Y chromosome test is in practice the only option. Very little is gained however by simply doing 12 or 25 markers because comparisons with other people with the same surname are only effective when comparing 37 or more markers. Many of the programs which help to make comparisons with another individual’s marker scores work best with 67 or 111 markers, and some do not work with fewer than 67 markers. The advantage to participants of an FtDNA (Family Tree DNA) Y chromosome project is that the company has the largest publicly available database of Y chromosome results for comparison. Surname projects allow direct comparisons with others, but it is possible to keep results private and not visible to the general public, and in any case there is no way to identify an individual testee based on kit number alone.
     One of the results of Y chromosome testing is that an initial assignment of the results to a haplogroup is made. For the majority of Western Europeans this will be R1b-M269 while most others will be I-M253 [originally I1], or I-M223 [originally I2] (both associated with Scandinavia, and originally from the Balkans), and R-M512 (or R1a), whose origins lie in the Steppes. What the numbers indicate is an SNP that defines a specific group (for example, R1b-M269 is Western Atlantic origin). Some companies, however, such as 23andme, still classify individuals by alphabetic lettering. Thus, in my own case 23andme labels my paternal line as R1b1b2a1a2f* ‘a subgroup of R1b1b2’ which is a ‘subgroup of R1b1’. R1b1b2 is then described in the accompanying explanation as:

·       Age: 17,000 years
·       Region: Europe
·       Example Populations: Irish, Basques, British, French
·       Highlight: R1b1b2 is the most common haplogroup in western Europe, with distinct branches in specific regions.

Hopefully the earlier part of this blog has shown why this doesn’t say much about recent genetic connections.
     For I1 [I-M253], for example, 23andme include generalised results for some famous individuals, including Leo Tolstoy and Warren Buffett. It’s unfortunate that some newspapers then choose to interpret such results as ‘meaning’ that a named person is ‘related’ to Warren Buffett (yes they are, but probably cousins at 10,000-20,000 years distance), or, even worse, that a named person is a ‘descendant’ of the historical figure (and that is extremely unlikely unless it happens to be Genghis Khan!!). In this way completely false stories about DNA relationships spread.
    The origins and spread of these haplogroups are shown on the attached map found in several forms on the internet:


Fig 8: Origin and Spread of haplogroups R1a, R1b and I

What about Big Y or Full Genome testing?
Big Y
FtDNA advertise their Big Y as follows:
“Nearly 25,000 known SNPs, placing you deep on the haplotree.
10 Million base-pair coverage - more than any other Y-DNA test on the market.
Find SNPs that may be completely unique to you.
Explore your deep paternal ancestry
Help the community uncover new, undiscovered SNPs.
Use your newly discovered SNPs to help grow the haplotree”.

Whereas FGC (Full Genome Corporation) offer:
The “GenomeGuide, a whole genome test for ancestry purposes, and Y Elite 2.1 a comprehensive test” of a person’s “Y chromosome. Y Elite 2.1 determines those markers (i.e. SNPs and STRs) that are most useful” for a person’s “paternal ancestry”.

Both these tests aim to locate SNPs on a male Y chromosome and may include SNPs classified as ‘private’, meaning that at this point in time they have only been found in a single or very limited number of individuals, and their exact significance to the more general tree or to an individual’s personal family tree has yet to be confirmed.
    It will be clear from the above product descriptions that FGC’s offer is more comprehensive (and they have other versions which probe the Y chromosome even more thoroughly, but cost considerably more). The essential problem lies in identifying which test, if any, gives the most useful information. Some project administrators make suggestions as to which more comprehensive test to take, or they highlight specific SNPs that an individual might choose, but these usually build on previous testing rather than being aimed at people starting to look at SNP testing for the first time. A good starting point is to observe what SNPs others in a group have already tested (FtDNA show these as ‘confirmed SNPs’ in green). Individuals who don’t know where to start with SNP testing do need to look for help from a project administrator regarding which SNP(s) to choose. If we take M269 (for group R1b) for example, in many projects in FtDNA this will show in red, meaning the SNP is predicted but unconfirmed. Normally the prediction is correct. If starting from this point probably the best thing to do, short of going straight to one of the two big tests mentioned above, is to have SNP L21 tested for positive or negative. If a person is L21 positive and doesn’t want to go down the line of Big Y or FGC testing then the next step, having looked at any confirmed green entries for SNPs in the sheets of Excel data for people lying nearby in the grid, is to go for an L21 SNP Panel either with FtDNA or with YSEQ.com (but using the latter will require a new registration and a new sample, although their pricing is competitive). If SNP testing is done with FtDNA, their results program will usually suggest what the next SNP tests might be. At a certain point in testing it is definitely worth (if only financially) trying an appropriate SNP Panel.
For instance, results in the STR Y chromosome tables for a surname project which lean towards L1335 suggest that that would be a good SNP Panel to test. Both FtDNA and YSEQ offer L1335 panels as well as individual SNPs (but doing one SNP at a time can get expensive).
     Just to re-emphasise: the advantage of the more comprehensive tests is that ‘private’ SNPs are often identified. Sometimes these are unique to an individual but sometimes they will be found in several individuals and therefore they may well define a discrete family group from within the historic period. However, in order to identify these as belonging to more than one person, other people who seem to be closely related (when looking at the other DNA male line results) need to test for the same ‘private’ SNPs. Many surname groups are working to try to identify these ‘private’ SNPs for family groups, both to advance genealogical links and to save participants some money!

Health related issues
Most DNA testing companies do not give reports which include information about health risks. Exceptionally, 23andme have offered health related reports in the past but after difficulties in America with the FDA they suspended these reports, but later reinstated some for the non-American market. These tests do not have a genealogical component and therefore will not be discussed further.

Ethnic mix
As mentioned earlier, FamilytreeDNA, through its MyOrigins report, ScotlandsDNA through Ancestry Painting, Ancestry.com through the AncestryDNA test, and 23andme through Ancestry Composition, all, with some variations in reporting procedures, aim to give an individual a ‘picture’ of his or her ancestral connections with populations around the world. Results naturally vary considerably from almost 100% European to real mixtures of different ancestral backgrounds including American Indian, Far Eastern, African and so on. Ancestry for example says that their DNA test ‘looks at a person's entire genome at over 700,000 locations’ and covers ‘26 ethnic regions’. Ancestry.com claim to have ‘more than 2 million people’ in their database and ‘the unique ability to connect with Ancestry’s billions of historical records and millions of family trees’. For further information on these tests and how they report see my 2015 blog on this site.

‘Discovering distant relatives’
The reference to Ancestry.com in the above paragraph was deliberate. On the one hand the ability to contact other members part of whose DNA is the same as one’s own is clearly attractive. This is exactly what I was referring to in the opening paragraphs of this blog. The difficulty is that Ancestry does not remind you to check that the information you receive from others is actually accurate. Many a false genealogical connection has been made through eagerness to get back as far as possible. What many people do not realise is that the written records on which genealogies are constructed can be missing for some areas of the world. Even in Scotland the records for the counties in the very north are missing for many localities before 1800 and almost universally before 1750. Wars and carelessness, as well as the wide dispersal of the populations in remote locations meant that children might well not ever be baptised, or if they were, it was done whenever the minister happened to be in the locality. However, it was the parish clerk’s job to keep the records, not the minister’s, and the parish clerk might be tending his cattle 20 or more miles away.  The same cautionary statement holds true of FtDNA’s Family Finder in that what appears in an imported GEDCOM file only represents the family researcher’s work and, as with all internet genealogy, needs to be checked for accuracy.



Fig 9: Screen grab from Family Finder proposed matches

In this screen grab from Family Finder, for the sake of privacy and data protection, I have removed the picture details of matches including the email of the individual whose family includes an individual related to my own family. The match is Charlotta Major but she is not an ancestor in my line, but her father Konrad born in 1797 is. This then is not an MtDNA link (and in any case the person who is my match has not tested this, nor, being female, could she test the Y chromosome), it is an autosomal link with a male line which is my mother’s great grandfather. I have, however, been unable to identify any links with the other individuals listed as matches.

Which company then?
As I said earlier the choice of company depends entirely on what question or questions you want answered:


Fig. 10: DNA testing company list

I have not drawn out trees based on STR Y chromosome results this year [that is, those that appear in a chart for people in, for example, an R1b group as having a number sequence like 13, 24, 14, 10, 11,14 etc.] since these results are too diverse and complex when making a comparison between surname groups in the project which now has over 1200 participants, and sometimes even too variable within a surname project name subgroup [as for example Greer, Grier, Grierson in the MacGregor Project]. In short, there are now too many people in the project to do comparison charts that would have any real meaning. Also, the amount of detail would be far too great to permit any links to be seen. Because of this I repeat here my usual offer in relation to those who have tested their Y chromosome through STR tests. If you wish me to run a comparison with other participants, then please state the group or individuals with whom you wish to be compared and I will make a personalised graph for you and help you interpret the results. Please note though that it is only feasible to compare like with like (i.e. 67 markers with 67, 37 with 37). As usual my email address is richardmcgregor1ATyahoo.co.uk (substitute @ for AT). Please contact me offline also for advice on SNP test choices. Could members of the Ian Cam MacGregor group [the bloodline group] please note that the terminal SNP for the group is currently S690 and we do not yet have any ‘private’ SNPs to recommend, other than S696 and S698 which seem to be carried only by the Glencarnock line, and may have arisen in the last 250-300 years. Apart from the two known carriers of these SNPs other members of the Ian Cam group who have tested these SNPs have found them to be negative.

05 January 2016

MacGregor DNA Project Update January 2016

MacGregor DNA project blog update January 2016

Welcome to the annual blog update of the MacGregor DNA project.  Last year I gave an introduction to the various DNA tests and what they could show. In the past year, while numbers in the project have continued to rise, most people seem to be opting for either the Family Finder test, or for SNP testing, in order to determine where on the human family tree they lie. For this year’s update I have decided to focus in particular on two case studies –  one of which has been supplied by Neil McGregor and Matt McGregor in Australia. I’ll get to these in a moment.

When I wrote about Family Finder last year I didn’t really emphasise the importance of uploading a personal family tree into a personal DNA webpage. If you are familiar with .GEDCOM files this will need no introduction but for the benefit of those unfamiliar with the file format I hope these brief comments will help.

There are numerous proprietary software packages on the market for recording family tree information and since, for once, the online community has agreed on a common format for exporting files, it does not matter which is used. For PC a commonly used program is ‘Family Tree Maker’ and for Mac ‘Reunion’ is often found (though Family Tree Maker for Mac also exists). In the main, all programs do the same thing so it really comes down to personal preference in the way data is entered and displayed as to which one to choose. The two I have mentioned are simply those that I use myself and should not be viewed as a recommendation.

The important thing about a GEDCOM file when used with Family Finder is that it creates a list of surnames which can easily be searched to see if any surname is the same in two, or more, ‘genetic cousins’.
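
For those curious about what such a surname search involves, here is a minimal Python sketch. It relies on the GEDCOM convention that a surname is written between slashes on a NAME line (for example `1 NAME John /Brown/`). The two tiny trees and all the names in them are invented for illustration, and this is not a description of how Family Finder itself does the matching.

```python
import re

# A level-1 NAME line with the surname between slashes, e.g. "1 NAME John /Brown/"
NAME_LINE = re.compile(r"^\s*1\s+NAME\s+.*?/([^/]*)/")

def surnames(gedcom_text):
    """Collect the set of surnames (upper-cased) found on NAME lines."""
    found = set()
    for line in gedcom_text.splitlines():
        match = NAME_LINE.match(line)
        if match and match.group(1).strip():
            found.add(match.group(1).strip().upper())
    return found

# Two invented trees, reduced to the lines that matter here
tree_a = """0 @I1@ INDI
1 NAME John /Brown/
0 @I2@ INDI
1 NAME Mary /MacGregor/
"""
tree_b = """0 @I1@ INDI
1 NAME Anne /Brown/
0 @I2@ INDI
1 NAME William /Stirling/
"""

shared = surnames(tree_a) & surnames(tree_b)
print("Surnames in both trees:", sorted(shared))
```

A shared surname found this way is only a prompt for further research: it shows that the same name appears in both trees, not that the two families are the same.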

Also new for this year’s update is the mitochondrial grouping for those who have done this test in the MacGregor project. It is very unlikely that exact matches will be revealed by this grouping (unlike male Y chromosome DNA) because MtDNA does not go with surname and it is quite unlikely that two people in what is essentially a male surname-based project  would happen to have the same mother’s mother’s mother etc, although if a close match on the full genome sequence for MtDNA were found, it might suggest a common emigrant ancestor.

The grouping of MtDNA results can be found at: 

As for previous updates I am conscious that it isn’t possible to cover all participants in the project in detail, especially now that there are over 1100 folk taking part. I am happy to make up some comparative charts for project members, much as will be presented below in relation to the Stirling connection. I only request that if you wish me to run a chart you state the kit numbers or group with which you wish to be compared. I can only run these comparisons for the Y chromosome test – and it does not give useful results if fewer than 37 markers are compared: in other words, the number of individuals who match within one or two mutations on 12 or 25 markers is so great as to make comparisons meaningless.


The other limitation to note on these generated charts is this: a computer program will only do comparisons on statistical or numerical result similarities, and since DNA mutates randomly, close family connections (that is, over the last 500 years) can get obscured by random mutations – some families’ DNA mutates faster than others. Various factors have been cited for increased mutation, such as age of father, place in the sequence of births for a couple, environment, diet, and radiation. The value of such comparisons therefore lies in the clues that they afford to possible connections: when paper genealogy fails then such clues can be very important!

Case Studies 1
The Stirling MacGregors

I was asked to compare the 111 marker results for kit number 13201 – representing the branch of the main MacGregor line who adopted the alias Stirling sometime during the 17th century. The story goes that a MacGregor (some say of the Glenstrae family, and some say his forename was Robert, or William or Duncan) was employed in the house of the Stirlings of Keir near Dunblane when the soldiers came looking for the MacGregor. They were met by the lady of the house who declared “there is no-one named MacGregor here” and from that time the MacGregor and his family used the surname Stirling.

While the family tradition is that the Stirlings originated in Glenstrae, the evidence shows that the first use of the Stirling alias is a “Willeam McGregour VcCoueill callit to ane toname (alias) Stirling” in the 1611a list [prepared for the Laird of Luss]. This person seems to be the William spoken of in the tradition above regarding the establishment of the Stirling alias. This makes him a William McGregor son of Donald. The 1611a list (#81) lists him as being of the House of Gregor McAne and hence of the Brackley family. Searching the data would place him in the following tree.

Gregour McAne (b~1520 alive 1595).
   |- Eion Dubh (b~1550 executed 1612). #43 in 1611a list (In Glenurchy)
   |    |- Patrick (b~1570)
   |    |- Alasdair cass (b~1571 executed 1613) #44 in 1611a list (Eion’s son)
   |    |- ? #48 in 1611a list (Eion’s son)
   |    ‘- Donmall (b~1572) #45 in 1611a list (Eion’s son)
   |        ‘- William (b~1590) #81 in 1611a list.
   ‘- Niall (5 sons)
This family came from the lands in Glenurchy and may have claimed “FROM GLENSTRAE” as distinct from “OF GLENSTRAE”.  [previous 15 Line comment received from Neil McGregor after seeing a draft of the blog].

In order to see whether there were some possible connections with other participants who had also done the 111 Y chromosome marker test I abstracted all the 111 results in the MacGregor Ian Cam subgroup [supposed founder of the clan who died 1390 in Glenorchy/ Glenstrae].  What was known to this point was that every individual who had the surname Stirling and had the genetic profile of a MacGregor had a specific mutation from 17 to 16 at marker 32 – a mutation not borne by any other members of the Ian Cam group. This suggested that this specific mutation had occurred by at least 1700.

Running the marker results through the comparison tool created by Dean McGee (this can be found at http://www.mymcgee.com/tools/yutility111.html) gives a possible ‘Time to Most Recent Common Ancestor’ result for comparing any two individuals with each other. This is only a rough guide to relative distance between two results, and since we expect that the time back to the originator of the clan is probably 600-650 years, clearly any figures which exceed this number of years in the grid would be too far back if we accept descent from Ian Cam [or supposed father Gregor]. However, the program is assessing relative distance between individuals on the basis of numbers of mutations, and in that context it gives a helpful indication of possible closeness of different branches of the family.
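To show why such a figure can only be a rough guide, here is a minimal sketch of one simple way to turn a mutation count into years. It is not the algorithm used by the McGee utility; it assumes that every marker mutates at the same average rate per generation, that mutations accumulate independently down both lines from the common ancestor, and that a generation is 30 years. The rate used in the example is a placeholder, not a published figure.

```python
def tmrca_years(mutations: int, markers: int, rate_per_marker: float,
                years_per_generation: float = 30.0) -> float:
    """Rough years back to a common ancestor.

    Expected differences between two lines = 2 * generations * rate * markers,
    so generations = mutations / (2 * rate * markers).
    """
    if markers <= 0 or rate_per_marker <= 0:
        raise ValueError("markers and rate_per_marker must be positive")
    generations = mutations / (2 * rate_per_marker * markers)
    return generations * years_per_generation


PLACEHOLDER_RATE = 0.004  # assumed average rate per marker per generation

for mutations in (5, 10, 15):
    years = tmrca_years(mutations, markers=111, rate_per_marker=PLACEHOLDER_RATE)
    print(f"{mutations} mutations on 111 markers: roughly {years:.0f} years")
```

Because the result scales directly with the chosen rate, a family whose markers happen to mutate faster than average will appear further apart than it really is.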

In the chart the kit number for Stirling is 13621.

Figure 1: TMRCA for members of the Ian Cam MacGregor group
So, according to this grid, the possible time distance between 13621 and, for example, 2124 – the line of the clan chief – is found by reading along the grid line of 13621 horizontally until it meets the line coming down vertically from 2124. This gives an estimate of 460 years and a split point of c1500 AD. There has been a suggestion that the Stirlings were originally Glenstrae MacGregors – this estimate would not rule out that being possible. Less likely would be any connections closer than 260 years because we know that there has been a Y chromosome mutation in the Stirling family that must have taken place before c1700. Also less likely are those time distances which exceed 900 years separating them from other participants – simply because surnames as such did not exist prior to about 1300 AD. This suggests that SNP testing (on individual results which show a larger time interval from other results) would be worth pursuing to see whether or not the individual concerned has the MacGregor SNPs S690 or S697. As more SNP results appear it may be that we will have to reconsider the formation of clan name groups as predating the adoption of an identifying surname. That will be a while coming however. [If you wish to see current results google Alex Williamson’s Big Y tree].
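The two limits used in this paragraph can be written down as a simple filter. The sketch below only restates the reasoning above, with the 260 and 900 year thresholds taken from the text; the function name and the labels it returns are chosen for illustration.

```python
def assess_tmrca(years: float, too_recent: float = 260,
                 too_distant: float = 900) -> str:
    """Label a grid estimate against the two limits discussed in the text."""
    if years < too_recent:
        return "less likely: closer than the pre-1700 Stirling mutation allows"
    if years > too_distant:
        return "less likely: before surnames; SNP testing worth pursuing"
    return "plausible"


for estimate in (200, 460, 1000):
    print(estimate, assess_tmrca(estimate))
```

The 460 year estimate for the chief's line falls inside the window, which is why it does not rule out a Glenstrae origin.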

It is important to note that not all the individuals in the above grid are called MacGregor. When the Ian Cam group was created individuals were added to the group on the basis of what their STR numerical results suggested (that is, they looked like MacGregor). In the above grid it is therefore significant that those which show the furthest time distance are 381858 Murray, 292892 and 350316 Stewart, 121048 McPherson, 120820, 189492 and 258767 McFarland and 237186 Hunt. Only two of the ‘further out’ results are MacGregor which suggests that even if the Y chromosome  STR results look similar, SNP testing will be necessary to find out if individuals belong to the main surname group or whether they split off earlier. So far the McFarlands have tested S690- but 292892 Stewart is S690+.

If we look at the chart which the above table generates we see that there are some interesting family groupings suggested:

Fig 2: 111 marker Ian Cam group chart


In particular, on the right hand side of the chart we see that many of the kit numbers mentioned in the last paragraph are all grouped together in very close proximity which seems to imply some closer genealogical connection. It would be interesting to know what the terminal SNP for each of these individuals is.

Most of the other branches appear to represent separate families though as I mentioned earlier it’s hard to say anything absolutely since DNA can mutate randomly even within the same branches of a family.  In the chart above it seems like Stirling kit 13621 is more closely related to McGrigor kit 256584. If this truly is a family connection then it must be before 1700 because kit 256584 does not carry the distinctive Stirling mutation at position 32.

It is useful to compare the locations of the Stirling results in the 111 analysis in Fig 2 above with those members mentioned in the following case study which makes a similar claim to Glenstrae origin but using DNA and documentation to make the case.

Case Studies  2
A surviving MacGregor of Glenstrae family?

The second case study is extracted from a longer version which will be published in the Clan Gregor Society Spring 2016 Newsletter, and will subsequently be available to download from this blog. The material has been put together and analysed by Neil McGregor and Matt McGregor in Australia using 111 marker results from Family Tree DNA, Big Y results from Family Tree DNA, and MacGregor results from ScotlandsDNA. [the following text is an abbreviated version of their article].

Analysis of the Y-DNA is associated with the known descendants of John McGregor (John of Monzie) who married Ann Stobie on 14 June 1826 in Monzie Parish, Perthshire, Scotland. As the Y-DNA only traces the male line we obtained data from three male lines descended from John of Monzie. These sons were Charles John (b 1836 at Glassworks, Alloa), John (b 1838 at Balmain, NSW) and James (b 1840 at Braidwood, NSW) (see figure 1 and tables 1-2 for the lines of the DNA samples and the results of the analysis). The data in tables 1 and 2 show that the family descend from the Clan Gregor hierarchical family and that the DNA from current descendants of John’s line (NRM & AAM) differs from the current descendant of Charles’s line (GCM) by 2 mutations (one in each from the common ancestor) and James’s line differs by three mutations from the other two. Both John’s and Charles’s lines differ from the Clan line by five mutations (389a and 389b appear to be a single mutation not two) and James’s by six mutations. The Y-DNA pattern of John of Monzie (b1790), the common ancestor, can be predicted and would have differed from the Clan line by four mutations - his Y-DNA data is given in Table 1. The actual time from the birth of John of Monzie to all the tested subjects is between 150 and 175 years, being 4-5 generations. The calculations are based upon the McDonald mutation rates, and using an average generation time of 30 years gives a reasonably accurate prediction. James’s data indicates that the family may have a higher mutation rate than is normal but recent research has indicated this may simply be the result of where the mutations occur on the Y-DNA and not an actual increase in mutation rate.
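The prediction of the common ancestor’s pattern can be illustrated with a small sketch. It assumes a simple majority rule: at each marker the ancestor is given the value held by most of the tested lines, with the Clan value preferred when the lines are evenly split. This is an illustration of the idea rather than a record of the procedure actually used, and the marker names and values are invented.

```python
from collections import Counter
from typing import Dict, List

Haplotype = Dict[str, int]


def predict_ancestor(lines: List[Haplotype], clan_modal: Haplotype) -> Haplotype:
    """Majority value per marker; ties fall back to the clan modal value."""
    ancestor = {}
    for marker, modal_value in clan_modal.items():
        counts = Counter(line[marker] for line in lines)
        top = max(counts.values())
        candidates = sorted(v for v, c in counts.items() if c == top)
        # Prefer the clan value on a tie; otherwise take the lowest candidate
        # (an arbitrary but repeatable choice).
        ancestor[marker] = modal_value if modal_value in candidates else candidates[0]
    return ancestor


def count_differences(a: Haplotype, b: Haplotype) -> int:
    """Number of markers at which two patterns differ."""
    return sum(1 for marker in a if a[marker] != b[marker])


# Invented values for four hypothetical markers.
clan_modal = {"A": 14, "B": 12, "C": 30, "D": 19}
line_john = {"A": 15, "B": 12, "C": 30, "D": 18}
line_charles = {"A": 14, "B": 13, "C": 30, "D": 18}
line_james = {"A": 14, "B": 12, "C": 31, "D": 17}

ancestor = predict_ancestor([line_john, line_charles, line_james], clan_modal)
print(ancestor)
for name, line in [("John", line_john), ("Charles", line_charles), ("James", line_james)]:
    print(name, count_differences(line, ancestor))
```

In this invented example two lines each sit one mutation from the predicted ancestor and so two mutations from each other, which is the same shape of result as "one in each from the common ancestor" described above.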

Lines of the Y-DNA samples from John McGrigor (Monzie b 1790).

John McGrigor (Monzie born 1790)
   |- Charles John (1836) (Alloa Scotland) 3rd generation Graeme Chisholm MacGregor
   |- William (No samples acquired from this line)
   |- John (1838 - Balmain NSW) 3rd and 4th generations Neil Roland and Alexander Andrew James
   ‘- James (1840) (Braidwood NSW) 4th generation  James Hardy

From these data we can estimate the time to the Clan common ancestor (Table 2) to be ~1350 using the McDonald mutation rate for the normal population. This figure is different if we calculate the mutation rates based upon the higher mutation rate seen within this family’s actual data, the date to common ancestor being between 1380 and 1618 with a median of 1480.

 Table 1. Summary of the Y-DNA 67 marker test for the Clan and the two subjects and that of John (1790).


Table 2. Time to common ancestor (Years) for John 1790 and his three descendants.


Using the Y chromosome mutation rates proposed by Doug McDonald the separation from the Clan modal occurred ~1380; however, as mentioned above, using the individual mutation rates determined for this family’s actual data gives later values. The mutation rates of NRM and GCM give the separation date as ~1480 and that of GH ~1618. The date derived from GH is far too close to the present, and the mutation rate therefore too high to be meaningful over all the generations, so only the data based upon NRM and GCM will be used.

Examination of the Clan documented history only places three Clan Gregor family lines in the Monzie/Crieff parishes in the 1600s and they are: Glenstrae, Glenlednock, Roro and a group who used the alias McAra, whom the documents show are actually of the Roro line. These Clan Gregor families have their common ancestors at various dates: Glenlednock have been proposed to have separated from the Glenstrae line in ~1400 and the Roro line prior to that date. However these dates are only speculative. Thus, the DNA evidence supports the possibility that the Bega family [family of NRM who first settled in Bega, Australia] could be derived from any of these three families. We have approximately 18-20 generations between the above tested individuals and Gregor [name father of the clan] born ~1300, which equates to 1 mutation every 3 to 4 generations. Whilst this mutation rate is higher than normal, it actually is found in the Familytreedna.
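The figure of one mutation every 3 to 4 generations follows from simple division. The short sketch below uses the numbers already given: 18 to 20 generations, and, as an assumption about which counts are meant, the five or six mutations by which the tested lines were earlier said to differ from the Clan line.

```python
generations_range = (18, 20)  # generations back to Gregor, from the text
mutation_counts = (5, 6)      # differences from the Clan line, from the text

# Generations per mutation for every combination of the two ranges.
ratios = [g / m for g in generations_range for m in mutation_counts]

for g in generations_range:
    for m in mutation_counts:
        print(f"{g} generations / {m} mutations = {g / m:.1f} generations per mutation")

print(f"range: {min(ratios):.1f} to {max(ratios):.1f}")
```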

Firstly, it is noted that the Bega [NRM] family has a unique mutation in the Clan dataset at DYS 389a and DYS389b of 14/31. No other tested subjects have this marker so we cannot be matched using that data. It is likely that this mutation occurred after 1500 and before 1790 and therefore would only be seen in recently related individuals. Analysis of the Clan Gregor Y-DNA database shows several distinct clusters with few mutations and several high mutation clusters. Analysis of these latter groups shows potential correlation between the mutations and certain geographic locations. Two of these appear to be sequential, and they are mutations at DYS576 at 16 and 17 when compared with the general Clan data at 18. Those at 16 are the Stirlings who lived in Dunblane and those at 17 from people who claim to be from the Glengyle family. Both these groups could have descended from the Glengyle although other data suggest that the Stirlings are from the house of Gregor McAne of Brackley and have lived in Glenurchy on the Glenstrae lands. It is very unlikely that the Bega [NRM] family are descended from the families of Roro, Glengyle, Brackley / Ladasach, or McRob. This strongly suggests that the Bega family are descendants of the one remaining family: the Glenstrae line.

Assessment of other genetic matches.

The Bega [NRM] family has a mutation of 9 at 459b. This is shared by 5 other participants in the Clan Gregor database.

Table 3 Analysis of subjects within Clan Gregor who have 9 at 459b.


This seems to be an early mutation in one branch of the family. These kit numbers were assessed using comparative Y-DNA analysis, and table 3 shows the mutations within the group. Table 4 shows the dates to common ancestor for the group and this suggests that the closest common ancestor was with kit number 191228. The date of the common ancestor is ~390 years ago, or ~1560, using the same mutation rates as for Neil [NRM] and his cousin Graham in the analysis. The oldest known ancestor for kit 191228 was Alexander McGrigor born in 1790, living in Lanarkshire.

Table 4. Time to common ancestor for the members of the analysis.

Kit 94903 has its oldest known ancestor in Lanarkshire in 1790. Kit 119330 has its oldest known ancestor in Buchanan in Stirlingshire, and kit 121048 gives its oldest known ancestor as one James McPherson from Abernethy/Kincardine and reports that they lived on the lands of the Grants. Kit 120679 did not report a location or ancestor.

The important factor about these different family locations is that they are very widespread across the Highlands and part of the lowlands. Whilst the dates and locations may be related to movement during the “clearances” or post Jacobite wars (1745/46), as appears to be the case with kit 191228, the others cannot be attributed to these events. This suggests that the 9-9 mutation group have an early mutation from a family that is now well scattered – most likely due to the early habitation of the occupation sites, movements during the proscription or some other upheaval. If the mutations occurred early in the family then the low numbers in the database could suggest that many of the 9-9 kindred may have been killed during the proscription hence their low numbers within the Clan database. The known aliases used by some within the 9-9 kindred are Graham for kit 191228, McPherson for kit 121048 and Murray which was used by the Bega (NRM kit 16198) family.

McGrigor in Lanarkshire.

Matt McGregor (Kit 191228) is the closest DNA sample, outside the immediate family, to the Bega McGregor kits [NRM] and also to McPherson (Kit 121048), suggesting that all three have a high probability of being from the same family and, from the discussion above, are therefore most likely Glenstrae. Importantly, the DNA data suggests that the common ancestor is likely to be Alaster Ruadh MacGregor (the DNA suggests the ‘time to common ancestor’ is 390-450 years for all 3 kits). This suggests that the individuals are likely to have arisen from different sons of that Chief (Alaster Ruadh 1524-1547).

[detail omitted which will be in the full report when published]

Of all the possibilities, when considering the actual family history, traditions and marriage connections, etc., the data at hand suggests Kit 191228 is most likely a descendant of Ewin the Tutor and probably from Kilmannan himself through the son Hugh. A more detailed report on this will be submitted later, in conjunction with the history of Kilmannan and Rob Roy that to date seems not yet to have been posted. In support of this contention Kit 191228 has 4 separate Autosomal DNA matches with known descendants of Kilmannan’s via his female line.

As Kit 191228 McGrigor and Kit 121048 McPherson both match the Bega McGregor [NRM] kits, with the DNA data being supported by the documentary evidence, the conclusion is that the Bega [NRM] McGregors are also Alaster Gald - Glenstrae descendants.

Neil McGregor has summarized this and other data analysed in a revision of the chart published in last year’s blog.

Fig 3: Proposed Clan Gregor descendancy chart as at end 2015



Just to repeat the invitation to request comparisons with other participants stated above: please state the group or individuals with whom you wish to be compared and I will help you interpret the results. Please note that it is only feasible to compare like with like (i.e. 67 markers with 67, 37 with 37). As usual my email address is richardmcgregor1ATyahoo.co.uk (substitute @ for AT).

Richard McGregor January 2016

Can Blog followers please note this message from Google:

In 2011, we announced the retirement of Google Friend Connect for all non-Blogger sites. We made an exception for Blogger to give readers an easy way to follow blogs using a variety of accounts. Yet over time, we’ve seen that most people sign into Friend Connect with a Google Account. So, in an effort to streamline, in the next few weeks we’ll be making some changes that will eventually require readers to have a Google Account to sign into Friend Connect and follow blogs.

As part of this plan, starting the week of January 11, we’ll remove the ability for people with Twitter, Yahoo, Orkut or other OpenId providers to sign in to Google Friend Connect and follow blogs. At the same time, we’ll remove non-Google Account profiles so you may see a decrease in your blog follower count.

We encourage you to tell affected readers (perhaps via a blog post), that if they use a non-Google Account to follow your blog, they need to sign up for a Google Account, and re-follow your blog. With a Google Account, they’ll get blogs added to their Reading List, making it easier for them to see the latest posts and activity of the blogs they follow.

We know how important followers are to all bloggers, but we believe this change will improve the experience for both you and your readers.

Posted by Michael Goddard, Software Engineer