Citing Ancestry DNA tests

DNA logo on top of school photo circa 1964

Introduction

At the last VicGUM meeting a few of us had a chat about citing our Ancestry DNA test results. I came away feeling that we hadn’t quite nailed it. There wasn’t that elegance of simplicity that happens when a solution to an issue really hits the mark. I didn’t want to make a bad decision. These are sources we were talking about and I don’t want to revisit my decision any time soon. I’ve been doing some more thinking because the voices in my head keep saying “What you are thinking is not good. It’s going to be a lot of work.”
This discussion about citing DNA tests came at a time when my planets are aligning. A cousin had written to say she is updating her branch of the family. She asked what current information I have. I’m not sure. Another writes as he unravels his part of the family – or has it become more tangled than ever? Another wrote of his continuing interest in family history. I suppose it is no surprise that the grandchildren of the men who kept shearing tally books still have an interest in local and family stories more than a century later.
All this has come at a time when I too am reviewing the information I have and how it fits together. Have I got it arranged to take the best advantage of the DNA tests we have? Do I have a timeline organised as a starting point for weaving tales? It’s only seven years since I last reviewed a lot of my data but a lot has happened since. More people are researching family and local history and there’s still more information becoming available.
And there is DNA. It’s now the time of year to start thinking about which DNA tests, if any, I should purchase in November.

DNA Citations

I like to begin adding material to my Legacy Family Tree family file by adding the basic source information first. Then it is ready for linking to the rest of the data as I go along. Our VicGUM discussion about citing Ancestry DNA tests was very timely.
Let me start with the conclusions of my deliberations. (My post-decision justification will come in a later post.)

Citing an Ancestry DNA test

This is a way to cite my brother’s AncestryDNA test:

Ancestry, DNA test for John Baulch
https://www.ancestry.com.au/dna/insights/D8A89B39-CC28-45CD-AAB4-2B46C4D0341E 

Don’t be frightened by these long web addresses. Pick them up from the url toolbar at the top of the relevant page and just paste them into the citation. And I can pick them up and paste back into the url toolbar when I want to return to the source. No searching through pages and pages of DNA results required!

There are three web pages I can navigate to from John’s DNA test page:

•    his ethnicity estimate
•    his DNA matches (including his shared matches)
•    his DNA Circles.

Citing an Ancestry DNA Ethnicity Estimate

A citation for an ethnicity estimate or DNA story can be created the same way. I’ve just added a little more information. Importantly, notice that the web address or url changed:

Ancestry, DNA test for John Baulch, (accessed 7 Sep 2018), ethnicity estimate,
https://www.ancestry.com/dna/origins/D8A89B39-CC28-45CD-AAB4-2B46C4D0341E?o_iid=90600&o_lid=90600&o_sch=Web%20Property

Because DNA is such an evolving field it is probably critical to include the access date. Changes are most notable with Ancestry’s ethnicity estimate, and I think John’s ethnicity estimate sits in limbo like Kitty Cooper’s did at the time of writing her last blog.
 
Citing an Ancestry DNA match

This is an example for an Ancestry DNA match:

Ancestry, DNA test for John Baulch, (accessed 7 Sep 2018), match with Henry Davenport,
https://www.ancestry.com.au/dna/tests/D8A89B39-CC28-45CD-AAB4-2B46C4D0341E/match/5164B8C8-CBC9-48A3-8F0B-8A9DC8E23F0D?filterBy=ALL&sortBy=RELATIONSHIP&page=1

It is also the same web address for shared matches so a citation to bring attention to shared matches might look like this:

Ancestry, DNA test for John Baulch, (accessed 7 Sep 2018), match and shared matches with Henry Davenport, https://www.ancestry.com.au/dna/tests/D8A89B39-CC28-45CD-AAB4-2B46C4D0341E/match/5164B8C8-CBC9-48A3-8F0B-8A9DC8E23F0D?filterBy=ALL&sortBy=RELATIONSHIP&page=1

Citing Ancestry DNA Circles

This is an example for an Ancestry DNA Circle:

Ancestry, DNA test for John Baulch, (accessed 7 Sep 2018), DNA Circle for John’s second great grandfather Francis Baulch
https://www.ancestry.com.au/dna/tests/D8A89B39-CC28-45CD-AAB4-2B46C4D0341E/evidence/HZ5F6NXG?returnPage=circles

It’s not so long ago that we struggled to get DNA circles going. Now there are 38 members in the Francis Baulch DNA Circle. I wonder how many there will be by Christmas this year? 100? I really do need to take time out to review the information in my Ancestry Family Tree.
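The citations above all follow one pattern: the source, the access date, what was looked at, then the web address. For anyone who keeps a lot of these, the pattern can be assembled mechanically. What follows is only a sketch in Python. The function name `dna_citation` and its parameters are hypothetical, made up for illustration, and have nothing to do with Ancestry or Legacy Family Tree.

```python
def dna_citation(tester, url, accessed=None, detail=None):
    """Assemble a citation in the pattern used in this post.

    tester   -- name of the person who took the test
    url      -- web address copied from the browser
    accessed -- optional access date, written as it should appear
    detail   -- optional description, e.g. "ethnicity estimate"
    (All names here are hypothetical, chosen for this sketch.)
    """
    parts = [f"Ancestry, DNA test for {tester}"]
    if accessed:
        parts.append(f"(accessed {accessed})")
    if detail:
        parts.append(detail)
    parts.append(url)
    return ", ".join(parts)


# Example using the DNA Circle address quoted above.
print(dna_citation(
    "John Baulch",
    "https://www.ancestry.com.au/dna/tests/D8A89B39-CC28-45CD-AAB4-2B46C4D0341E/evidence/HZ5F6NXG?returnPage=circles",
    accessed="7 Sep 2018",
    detail="DNA Circle for John’s second great grandfather Francis Baulch",
))
```

Leaving out the access date and the detail gives the short form suitable for a bibliography entry.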

Report Bibliography

The first part of creating a citation is to describe WHAT the source is:

•    a book
•    a newspaper
•    a parish register
•    a personal communication
•    a website (including a web page for an Ancestry DNA test)

Here are some examples:

Bishop, Les, The Thunder of the Guns! A History of 2/3 Australian Field Regiment (Sydney: 2/3 Australian Field Regiment Association, 1998)

(Melbourne) The Herald

St Peter and St Paul’s Church of England (Muchelney, Somerset, England), Parish Registers 1702-1997

Personal Knowledge of Alexander Learmonth (1809-1874)

Stephen Luscombe, The British Empire: Where the Sun Never Sets (https://www.britishempire.co.uk/)

The National Archives of the UK. “TNA WO 392 Prisoners of War Lists, Second World War.”

Ancestry, DNA test for John Baulch https://www.ancestry.com.au/dna/insights/D8A89B39-CC28-45CD-AAB4-2B46C4D0341E

Report Citations

A full citation generally requires more information than just WHAT the source is. I need to know WHERE precisely in the source is the location of the evidence I am relying upon to tell my family story or to construct my family tree:

• the page in a book
• the page and column in a newspaper
• the page and/or date in a set of parish registers
• the date and correspondents on a letter
• the web address, or url, for a website
• a match url for an Ancestry DNA match

Here are some examples:

Bishop, Les, The Thunder of the Guns! A History of 2/3 Australian Field Regiment (Sydney: 2/3 Australian Field Regiment Association, 1998), p266

Poets and War, (Melbourne) The Herald, 1 Feb 1947, p 12, col 7; accessed in The National Library of Australia http://nla.gov.au/nla.news-article245867791

St Peter and St Paul’s Church of England (Muchelney, Somerset, England), Parish Registers 1702-1997, accessed in South West Heritage Trust: Somerset Archives & Local Studies; Somerset, England, Church of England Baptisms, Marriages, and Burials, 1531-1812 at www.ancestry.com.au; baptism of Hannah Baulch, Nov 1761

Personal Knowledge of Alexander Learmonth (1809-1874), letter to his brother William dated 28 Nov 1856, J W Baulch Personal Collection

4th Dragoon Guards (http://www.britishempire.co.uk/forces/armyunits/britishcavalry/4dg.htm)

The National Archives of the UK. “TNA WO 392 Prisoners of War Lists, Second World War” accessed in UK, Prisoners of War 1939-1945 index at ancestry.com.au and images at fold3.com, entry for VX114, Lieutenant John Noel Learmonth

The National Archives of the UK. “TNA WO 392 Prisoners of War Lists, Second World War” accessed in UK, Prisoners of War 1939-1945 index at ancestry.com.au and images at fold3.com, entry for WX3326, Lieutenant Colonel Leslie Le Souef

(Yes, I am cheating here. The important part is that I start in Ancestry and finish in Fold3 with an image. It just looked too frightening here to put both web addresses in the one citation.)

Ancestry, DNA test for John Baulch, (accessed 7 Sep 2018), ethnicity estimate, https://www.ancestry.com/dna/origins/D8A89B39-CC28-45CD-AAB4-2B46C4D0341E?o_iid=90600&o_lid=90600&o_sch=Web%20Property

Ancestry, DNA test for John Baulch, (accessed 7 Sep 2018), match and shared matches with Henry Davenport, https://www.ancestry.com.au/dna/tests/D8A89B39-CC28-45CD-AAB4-2B46C4D0341E/match/5164B8C8-CBC9-48A3-8F0B-8A9DC8E23F0D?filterBy=ALL&sortBy=RELATIONSHIP&page=1

Ancestry, DNA test for John Baulch, (accessed 7 Sep 2018), DNA Circle for John’s second great grandfather Francis Baulch https://www.ancestry.com.au/dna/tests/D8A89B39-CC28-45CD-AAB4-2B46C4D0341E/evidence/HZ5F6NXG?returnPage=circles

It has taken a little while to place information in the appropriate fields in my Legacy Family Tree Master Source List and the associated Source Detail item but I do like the result.

For myself.

Ancestry and Fold3 are subscription based, so a subscription is needed to access the Prisoners of War information.
Only John and I (as his manager) have access to his DNA test so others should not be able to use these links to access information. I hope! Let me know if you are able to.
My post-decision justification – or how I arrived at these examples – will appear in a blog in a little while. It will include my thoughts on the principles of creating a family tree and my reasons for not using Dates or Events with DNA information.

I hope these examples are of some help.


Bottom line, whether you use these examples or not, do save information about sources so you know what you used and where it is so you can return to it when required. 

Census records – one of my gateway sources


I call some of the sources I use my gateway sources. I find them critical to breaking down brick walls. Do I stand at the gateway afraid to go any further? Do I stand in the open gateway thinking about how to approach a completely new set of sources that may contain family stories?
Passenger lists are one of my gateway sources. Before a family member embarks on their journey to Australia I focus on British sources. Once a family member arrives in Australia I search for my family stories here in Australia.
Census records, particularly those that form part of the 1841 English census collection, are one of my favourite gateway sources. They set a point in time for setting aside Australian collections and turning to English collections. Furthermore, information contained in an 1841 England census record may confirm information I already have or may give some clues about which other English collections I should look at.
For example, the 1841 England census records are pivotal in telling the story of my paternal two greats grandfather Francis Baulch and his wife Ann Bowles. The census records establish that the family was still living in Pitney, Somerset at census time. The census records also contain hints as to why the family emigrated to Tasmania with other Pitney, Somerset families not long afterwards.
There is no doubt that Francis’s family was in dire straits by 1841. As were many such families following the enclosures in the area several years beforehand. The Pitney churchwardens were concerned about the debt owed to them by Francis’s mother. Francis couldn’t help. He had a young and growing family to provide for. And Francis had difficulty getting sufficient work to sustain his own family let alone help his mother in her difficulties. One year he did manage to win the contract for hauling stone for the roads but was unable to retain the contract. Francis’s brother, Enoch, in common with many other young agricultural laborers, also had difficulty in obtaining work. And when he did have work Enoch was paid a pittance.
The 1841 England census was held on the 6th of June. It was summer harvest time and may well have been one of those times that Enoch Baulch had work. It’s likely that Enoch was one of the unnamed men recorded in the census as living in sheds.
The Baulch men, and other men like them, would have been receptive to Henry Dowling’s search for experienced agricultural laborers in 1840/1841. Tasmanian farmers had appointed Dowling as their agent in the farmers’ search for workers to replace men who had left Tasmania for the opportunities in the new Port Phillip district.
In the autumn following the 1841 Census the Pitney churchwardens gave Francis Baulch and Charles Bartlett, both with young families to support, funds to purchase clothing and other necessities to help them emigrate. By late November 1841, the two men, their families and some closely connected families sailed for Tasmania. They were avoiding facing another bleak winter in Pitney.
But some family members didn’t come. The census records give clues as to why.
For example, Francis’s brother William Baulch was living next door to his mother at the time. No doubt to help his mother when needed. His mother remarried in 1845 so William and his family were then free to emigrate. There is a clue in the 1841 census records that helped find William’s new home. In 1841 William Baulch and Martha Cook had a ten-year-old boy, Edward or Edmond Perrin, staying with them. There they all are emigrating to the United States in 1850 and can be followed in the US censuses thereafter.
Others weren’t of the right age or otherwise not qualified for assistance to emigrate. Some of the children later emigrated, with many of Henry Baulch’s descendants emigrating to Queensland.
Charles Edgar, one of Ann Bowles’ younger half brothers, went to Ontario, Canada.
Frances Fletcher tree
Which brings me to a source that I think may become another of my gateway sources. I have a DNA autosomal match with a Canadian cousin. On my side of our family tree the match comes about because I am a descendant of Henry Bowles and Frances Fletcher, Ann Bowles’s parents. On the other side of our family tree the match comes about because my Canadian cousin is a descendant of William Edgar and Frances Fletcher, Charles Edgar’s parents. The ancestor we have in common is Frances Fletcher. The chromosome segments where we match, therefore, must have been passed down from Frances Fletcher. But which segments on which chromosomes?

Selected Bibliography:
The National Archives (TNA): HO 107/955 f4 p1 Census Returns: 1841
Canada Census, 1851 – 1861 [database] www.familysearch.org
United States Census, 1860 – 1870 [database & images] www.familysearch.org
St John the Baptist Church of England (Pitney, Somerset, England). Parish chest material.
AncestryDNA [database]. www.ancestry.com.au.

Autosomal DNA and Probability

The general wisdom is that matches on autosomal DNA are only accurate for up to four or five generations (or to second cousins). Beyond this limit any matches that occur probably occur by chance, not by inheritance. This is because any match of 5% or less may be attributable to random chance rather than to inheritance.
My purpose here is to suggest that, by referring to our traditional written family history research and by carefully planning our DNA tests, we may be able to identify matches way beyond our great grandparents and our second cousins.

My Ancestors

I have two parents. It is expected that I receive half or 50% of my autosomal DNA from my father and half from my mother. This seems to be an acceptable proposition.
I have four grandparents. It is expected that I receive one quarter or 25% of my autosomal DNA from each of my grandparents. That is, it is expected that I received 25% from my grandfather Bert Baulch, 25% from my grandmother Annie Abbey, 25% from my grandfather Noel Learmonth and 25% from my grandmother Edith Salter.
I have eight great grandparents. It is expected that I received one eighth or 12.5% of my autosomal DNA from each of my eight great grandparents.
At the fifth generation it is expected that I received one sixteenth or 6.25% of my autosomal DNA from each of my sixteen two greats grandparents. Can the expected values for receiving autosomal DNA from my two greats grandparents definitely be attributed to inheritance? After all, the upper mark of 5% which is used to indicate matches that may be wholly attributed to chance is not all that far removed from the 6.25% that may be attributable to inheritance from one of my two greats grandparents.
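The arithmetic behind these expected values is simply a halving at each step back from me. A minimal sketch in Python follows; the function name `expected_share` is an assumption made for illustration, and the figures are expected values only, not what any one person actually inherits.

```python
def expected_share(generations_back):
    """Expected fraction of my autosomal DNA from one ancestor.

    generations_back: 1 for a parent, 2 for a grandparent, and so on.
    The share halves with every generation.
    """
    return 0.5 ** generations_back


ancestors = [
    ("parents", 1),
    ("grandparents", 2),
    ("great grandparents", 3),
    ("two greats grandparents", 4),
]

for label, steps in ancestors:
    number_of_ancestors = 2 ** steps  # the count doubles each step back
    print(f"{number_of_ancestors} {label}: {expected_share(steps):.2%} each")
```

One more step back gives 32 three greats grandparents at 3.125% each, which is already below the 5% mark.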
Now none of my direct ancestors are alive and so aren’t available for DNA testing. I have to rely upon my siblings and upon my cousins. The expected values of a match on autosomal DNA tests for my ancestors, siblings and cousins can be summarised in tabular form as follows:
Relationship chart
The expected value of sharing autosomal DNA with one of my siblings is 50%. I actually share 38% autosomal DNA with one of my brothers. The expected value for shared autosomal DNA with any one of my first cousins once removed is 6.25%. I actually share 7.3% autosomal DNA with one first cousin once removed and 5.4% with another.
Should actual values that differ from expected values be cause for concern? Absolutely not!
However, rather than accepting the relationship for any autosomal DNA match by a testing company as being set in stone, I do believe that my written genealogy confirms the autosomal DNA match result. Equally, the autosomal DNA match is a further independent source that may substantiate my written genealogy. The two are not separate but dependent one upon the other.
The methodology for calculating the likelihood of what autosomal DNA we are expected to have should be familiar to us all.
Consider tossing a coin. The first toss may be heads. The probability of the second toss being heads is still 50%. Even if the second toss is heads the probability of the third toss being heads is still 50%. Thus in a small sample of 3 tosses the result of three heads doesn’t indicate that a double headed coin is being used. However, if the result still remains heads after hundreds or thousands of tosses I might be inclined to check whether the coin is biased in some way. According to Bernoulli’s theorem, the more a coin is tossed the more likely it is that the actual proportion of heads approaches the expected value of 50%.
Now consider throwing a die or dice. The first toss may be a 4. The probability of throwing a 4 is one sixth. Indeed for an unbiased die the probability of throwing one of the six numbers is always one sixth irrespective of the previous throws. For a short number of throws there may be a run on a particular number but this in no way alters the probability for the next throw of the die. For each number that probability is one sixth. As for the coin toss, over hundreds and thousands of throws of the die the actual value over all of these throws will approach the expected probability of one sixth for each of the six numbers on the die.
This method of calculating expected values for the toss of a coin and the throw of a die can be applied to the passing of autosomal DNA from two parents to a child. The options for a toss of a coin are either heads or tails. The options for the throw of a die are 1, 2, 3, 4, 5 or 6. The options for a child are that the child receives its autosomal DNA half from its father and half from its mother. As for the coin and as for the die the actual value of autosomal DNA received in the short term may differ from the expected value. As for the coin and as for the die over millions and indeed billions of generations the actual value of autosomal DNA a child receives from its parents will approach the expected value of 50% from its father and 50% from its mother.
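The behaviour of the coin and the die over few and many throws can be seen in a small simulation. This is only a sketch in Python. The function names and the seed value are arbitrary choices for illustration, and the exact fractions printed depend on that seed.

```python
import random


def heads_fraction(tosses, rng):
    """Toss a fair coin `tosses` times; return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses


def face_fraction(throws, face, rng):
    """Throw a fair die `throws` times; return the fraction showing `face`."""
    hits = sum(rng.randint(1, 6) == face for _ in range(throws))
    return hits / throws


rng = random.Random(1841)  # fixed seed so a run can be repeated

# Small samples wander; large samples settle near 1/2 and 1/6.
for n in (3, 100, 100_000):
    print(n, heads_fraction(n, rng), face_fraction(n, 4, rng))
```

With only three throws the fractions can be anywhere. With a hundred thousand they sit close to 0.5 for heads and about 0.167 for the 4.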
But is this so? What is it that Family Tree DNA and AncestryDNA are testing with respect to autosomal DNA? Is there an equal chance of this autosomal DNA information coming from one parent as from the other parent? Let’s start by looking at DNA in the whole cell before focusing on autosomal DNA.
Each cell in our body contains DNA. In the cell proper DNA can be found in the mitochondria. This DNA is known as mitochondrial DNA. DNA is also found in the cell nucleus which contains 23 pairs of chromosomes each containing DNA. The 23rd pair is known as the sex chromosomes. The 23rd pair for men is made up of one X chromosome and one Y chromosome. Women have 2 X chromosomes. The first 22 pairs of chromosomes are known as autosomes. Autosomes contain autosomal DNA.

Cell with nucleus and mitochondria

In a search for genealogical DNA the testing companies test in excess of 700,000 markers on the “junk” DNA portion of our autosomal chromosomes. These markers are the sites of single nucleotide polymorphisms or SNPs (pronounced snips). A person’s autosomal SNPs can be identified and compared with another person’s autosomal SNPs.
Apart from identical twins, each of us is unique. We see this as we walk down the street or glance around a football crowd at the MCG. It is easy, therefore, to apply the law of large numbers as discussed above to the more than 700,000 SNPs. To me 700,000 seems to be a large number. Surely, for each marker or SNP there is a 50% chance that I inherited that SNP from my father and a 50% chance that I inherited that SNP from my mother. Surely, as with the coin and the die, I had an equal chance of receiving each marker independent of the previous marker and the marker following.
There are two difficulties with this assumption.
Firstly, autosomal DNA tests are not able to distinguish which markers I inherited from my father and which I inherited from my mother.
Secondly, if the first wasn’t a knockout blow, the markers are set out on a strand of DNA. Unlike each toss of a coin or each throw of a die, whether or not I inherit a marker from my father or from my mother is not independent of who I inherited the previous marker from or who I inherited the next marker from. That is, the 700,000 SNPs are linked along the DNA strand. For example, the autosomal DNA I share with my brother on chromosome 3 and which we must have inherited from our father or our mother or a combination of both occurs along most of the chromosome.
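The difference between independent markers and linked markers can be illustrated with a toy model. The sketch below, in Python, is a deliberate simplification and not how any testing company works. The labels A and B stand for two hypothetical sources of a marker, and the figures of 1,000 markers and 2 crossover points are invented for illustration. It counts how often the source changes from one marker to the next.

```python
import random


def independent_markers(n, rng):
    """Coin-toss model: each marker's source is chosen on its own."""
    return [rng.choice("AB") for _ in range(n)]


def linked_markers(n, crossovers, rng):
    """Strand model: the source only changes at a few crossover points."""
    points = set(rng.sample(range(1, n), crossovers))
    current = rng.choice("AB")
    strand = []
    for position in range(n):
        if position in points:
            current = "B" if current == "A" else "A"
        strand.append(current)
    return strand


def count_switches(markers):
    """Number of places where neighbouring markers have different sources."""
    return sum(a != b for a, b in zip(markers, markers[1:]))


rng = random.Random(3)  # arbitrary seed so a run can be repeated
print("independent:", count_switches(independent_markers(1000, rng)))
print("linked:     ", count_switches(linked_markers(1000, 2, rng)))
```

In the coin-toss model the source changes at roughly every second marker. In the strand model it changes only at the crossover points, leaving long unbroken runs.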

Chromosome 3

Now we don’t match along the whole of chromosome 3 but where we do match it is mostly in one long strand. Indeed, the longer the strand we share the closer is our predicted relationship.
Consider a little. This phenomenon of linked markers has helped me detect relationships beyond those predicted by chance – beyond our great grandparents and our second cousins. For example, I have confirmed a relationship with a third cousin twice removed as well as – wait for this – a sixth cousin twice removed! These results are quite beyond my great grandparents and second cousins (that is second cousins without any removes).
DNA testing for family historians is still in its infancy. The databases of results are still very small. Nevertheless I think I can apply traditional genealogical research techniques to my DNA research:

  • DNA is no substitute for quality traditional genealogical research. Sad to say but true.
  • I have started my analysis with an autosomal DNA test and started with myself. Then I moved from my closer relations to my more distant relations.
  • I have tried to optimise my chances of detecting matches by including a family tree of my ancestors and the names of my ancestors where possible.
  • I have uploaded my information to Gedmatch as some family have tested on Family Tree DNA and some on AncestryDNA. My challenge now is to encourage our family to also share their results by uploading to Gedmatch (www.gedmatch.com), especially those who have tested with AncestryDNA, for AncestryDNA has no facility to examine results. (For those who tested with AncestryDNA: go to Settings and download the raw DNA data, then create a Gedmatch account and follow the instructions for uploading to Gedmatch. BE WARNED! These raw files are very, very large and take quite some time to download and upload.)
  • It will involve some of that boring work that doesn’t seem to yield any exciting results but I suspect that it may be worthwhile in the long term to examine my results down to the 1 centiMorgan level and by each chromosome. I see this as akin to searching through parish registers or census results.

Y-DNA Baulch

Cell showing nucleus and mitochondria

There are so many genealogical collections readily available these days it is tempting to try them all. Without thought or regard as to a collection’s relevance to the particular information sought. Those collections that are at hand are accessed first. Never mind the other 95% of collections which have yet to be digitised or indexed. It is easy to tap a key and search for the information online when I really do know in my head that my searching would be more productive if only I travelled to archives on the other side of the world or just spent time searching painstakingly through films and microfiche nearer to home.
But where to start searching further for my three greats grandmother Mary, wife of George Watts? I have found her in two English census returns indicating that she may have been born a British subject in foreign parts. Foreign parts? Where to begin?
I asked my cousin Val whether she would indulge my curiosity and undergo a DNA test. She kindly obliged. It was not until Val’s results arrived that I realised how little I know about DNA and today’s genetics. I was lost to Mendelian genetics when dominant brown eyes and recessive blue eyes were discussed. Where did that leave my hazel eyes? So the current genealogical literature about DNA seemed to me to be riddled with scientific terms that still leave me confused. I guess there is just so much to absorb that my little brain has been in overload for quite some time now.
Should I have done the more traditional or paper genealogical research that I had been avoiding before I set out on my DNA journey? Definitely. In a way my avoidance of a little hard work has voided the DNA results received – at least for the time being.
Val’s results have sent me back to reassess my research strategy and use of DNA as a research tool. But my brother John’s results are more promising if not equally confusing. So I am using John’s results as a medium for gaining an understanding of DNA analysis for genealogists.
John and I can trace our ancestry back on our paternal side to a Charles Baulch who married Ann Biddlecombe on 1 April 1799 at Muchelney, Somerset, England. On reviewing the information I agree with my sister. She says that because she couldn’t find the death of Charles Baulch in the civil indexes she concluded that he must have died before civil registration began in 1837. That doesn’t mean Charles Baulch died in 1836 and indeed our best guess is that Charles died between the time the Muchelney churchwardens wondered what to do with Baulch’s children and the time shortly later when their concern focused on what to do with Ann Baulch’s children.
We also have a dilemma about when our ancestor Charles Baulch was born. Certainly a Charles Baulch was born in Muchelney on 25 January 1767 to Roger Balch and Betty Gaylard. However, a Charles Baulch was buried just over a month later on 8 March 1767 in Muchelney and the infant son of Roger Balch seems to be the only candidate for this burial. So who married Ann Biddlecombe on 1 April 1799?
The obvious course of action is to search neighbouring parishes for a suitable Charles Baulch – fanning out to further parishes if necessary. Fortunately there is a copy of Dr Campbell’s index to baptisms and marriages for Somerset held on microfilm at the Genealogical Society of Victoria, and indexes for many Somerset parishes are now available on FreeReg, so I have a deal of work to do searching through these two sources available to me without having to travel the world.
Meanwhile, until I am able to motivate myself to do this paper genealogy is there anything in the analysis of John’s DNA that catches my attention? Maybe.
There are three parts to the analysis of John’s DNA. The first part involves analysis of his Y chromosome. The human cell contains a nucleus which includes 46 chromosomes. The first 44 are paired as autosomes and the last two form the pair of sex chromosomes. A male has one Y chromosome and one X chromosome. A male receives his Y chromosome from his father, who received his Y chromosome from his father, and so on. That is, the surname and the Y chromosome follow the paternal line.
In particular my brother received his Y chromosome from our father who received it from his father (our grandfather) who received it from his father, Samuel Baulch who received it from his father Francis Baulch who received it from Charles Baulch, our three greats grandfather. And there our paper genealogy trail finishes for the moment. But who did Charles Baulch receive his Y chromosome from?
Two tests are performed on the Y chromosome. In the first test short segments of DNA (markers) are measured and the number of repeats, short tandem repeats (STRs) are recorded. These results form an individual’s haplotype.
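Counting repeats at a marker can be pictured as counting how many times a short motif occurs back to back. The sketch below, in Python, is only an illustration of that idea. The sequence and the motif are made up. They are not John’s results, not a real marker, and not the method any laboratory uses.

```python
def longest_tandem_repeat(sequence, motif):
    """Return the largest number of back-to-back copies of `motif`."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        count = 0
        position = start
        # Keep stepping forward while another copy of the motif follows.
        while sequence.startswith(motif, position):
            count += 1
            position += len(motif)
        best = max(best, count)
    return best


# Invented example: 11 copies of GATA between two short flanking pieces.
made_up_sequence = "CCTA" + "GATA" * 11 + "TTGC"
print(longest_tandem_repeat(made_up_sequence, "GATA"))  # prints 11
```

A set of such counts, one for each marker tested, is what makes up the haplotype.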

DNA strand

The second test examines particular points on the Y chromosome looking for mutations or single nucleotide polymorphisms (SNPs). That is, the particular point is examined to see whether an instance of adenine, thymine, cytosine or guanine has mutated to one of the other three. Paternal lineages may be constructed for the Y chromosome using these mutations as nodes in the paternal lineages.
The results from both tests for Y-DNA analysis predict which haplogroup an individual belongs to. John, for example, belongs to haplogroup I-M253 based on analysis of his Y-DNA. And while the database is still small there are also several Baulchs that belong to this haplogroup including many who can trace their ancestry back to Somerset. But many generations earlier than I have been able to establish our genealogy.
There is still a great deal of research to be done.