GENETIC VARIANTS ASSOCIATED WITH HUMAN-DIRECTED HYPER-SOCIAL BEHAVIOR IN DOMESTIC DOGS
Disclosed herein are structural variants in the Williams-Beuren Syndrome locus of the dog genome that are associated with hyper-social behavior in dogs relative to wolves, and that are informative regarding the nature of social behavior in dogs. Also disclosed is a commercial test that uses these loci as indicators along the spectrum of sociality. Methods of breeding dogs to select for increased sociability are also disclosed.
- 1. A method for predicting the probability of a canine exhibiting a sociable behavior comprising:
(a) genotyping a biological sample from a canine; (b) counting the number of structural variants within the Williams-Beuren Syndrome (WBS) locus on canine chromosome 6; and (c) predicting the probability of the canine exhibiting a sociable behavior based on the number of structural variants.
- 2. A method of ranking dogs or wolves according to their likely level of exhibiting a sociable behavior comprising:
(a) obtaining a biological sample from a first dog or wolf; (b) determining the number of structural variants within the Williams-Beuren Syndrome (WBS) locus on chromosome 6 of the first dog or wolf; (c) obtaining a biological sample from a second dog or wolf; (d) determining the number of structural variants within the Williams-Beuren Syndrome (WBS) locus on chromosome 6 of the second dog or wolf; and (e) ranking the first dog as being more likely to exhibit a sociable behavior than the second dog if the number of structural variants determined in step (b) is greater than the number of structural variants determined in step (d);
(f) ranking the second dog as being more likely to exhibit a sociable behavior than the first dog if the number of structural variants determined in step (d) is greater than the number of structural variants determined in step (b).
- 12. A method of screening a dog or wolf library comprising:
(a) obtaining a genomic library from a dog or wolf that contains the Williams-Beuren Syndrome (WBS) locus on canine chromosome 6; (b) determining the number of structural variants in the WBS locus.
- 25. A method of producing dogs that are more likely to exhibit a sociable behavior comprising:
(a) selecting a male and female dog for breeding that each are known to have at least one structural variant within Cfa6.6, Cfa6.7, Cfa6.66, or Cfa6.83 in the Williams-Beuren Syndrome (WBS) locus; and (b) mating the dogs of step (a) to produce offspring.
- 28. A method of editing the genome of a dog comprising:
(a) obtaining a dog; (b) using clustered regularly interspaced short palindromic repeats (CRISPRs)/CRISPR-associated (Cas) 9 to inactivate a gene in the Williams-Beuren Syndrome (WBS) locus on canine chromosome 6.
- 30. A kit for detecting the presence of structural variants within the Williams-Beuren Syndrome (WBS) locus of canines comprising one or more primers selected from the group consisting of:
This application claims priority from U.S. Provisional Patent Application Ser. No. 62/527,653, filed Jun. 30, 2017, the content of which is hereby incorporated by reference, in its entirety.
This invention was made with government support under Grant No. GM086887 awarded by the National Institutes of Health, and Grant Nos. DEB-1245373 and DMS-1264153 awarded by the National Science Foundation. The government has certain rights in the invention.
Although considerable progress has been made in understanding the genetic basis of morphologic traits (e.g., body size, coat color) in dogs and wolves, the genetic basis of their behavioral divergence is poorly understood. While decades of research have focused on the unique relationship between humans and domestic dogs, the role of genetics in shaping canine behavioral evolution remains to be elucidated. Existing hypotheses on the behavioral divergence between dogs and wolves posit that dogs are more adept at social problem solving (1) due to an evolved human-like social cognition (2,3). However, mounting evidence suggests that human-socialized wolves can match or exceed the performance of domestic dogs across these socio-cognitive domains (4). What remains empirically robust is that dogs, compared with wolves, display exaggerated gregariousness that persists into adulthood. This trait, referred to as hyper-sociability, is a heightened propensity to initiate social contact that is often extended to members of another species. Hyper-sociability, one facet of the domestication syndrome (5), is a multifaceted phenotype that may include extended proximity seeking and gaze (6, 7), heightened oxytocin levels (6), and inhibition of independent problem solving behavior in the presence of humans (8). This behavior is likely driven by behavioral neoteny, the extension of juvenile behaviors into adulthood, which increases the ability of dogs to form primary attachments to social companions (4).
Due to strict selective breeding rules, distinct dog breeds conform to a predictable phenotype. It is this population structure and isolation that presents the dog as a powerful model for exploring the genetic underpinnings of complex traits such as behavior (9). Many dog breeds have been collectively scored using standardized tests for behavioral personality traits central to their domesticated nature (e.g., playfulness, sociability, aggression, trainability, curiosity or boldness) and breed-specific function (e.g., herding, pointing, chasing, working) (9, 10-17). Though there has been strong selection for breed conformation, the contribution of inter-individual variation to heritability estimates suggests that genetics plays a detectable role in shaping canine social behavior (18).
Phenotype evolution during the divergence of dogs from wolves under domestication has been investigated through a genome-wide association scan of over 48,000 SNP genotypes from 701 dogs from 85 breeds, and 92 gray wolves with a Holarctic distribution (19). Using divergence, the top ranking outlier site was located within SLC24A4, a gene known to contain polymorphisms linked to eye and hair color variation in humans (19). The second ranking site was located within WBSCR17, a gene implicated in Williams-Beuren Syndrome (WBS) in humans. WBS is a neurodevelopmental disorder caused by a 1.5-1.8 Mb hemizygous deletion on human chromosome 7q11.23 spanning approximately 28 genes (20). This syndrome is characterized by delayed development, cognitive impairment, behavioral abnormalities, and hyper-sociability (21-23). A number of other studies have taken a different approach and targeted genes linked to social behavior in other taxa. For example, targeted variation was surveyed in the dopamine receptor D4 and tyrosine hydroxylase, both genes extensively studied for their roles in the primate brain's reward system (24). The study found an association between longer repeat polymorphisms and lowered activity and impulsivity in a limited survey of breeds. In a similar approach, variation surveyed at a regulatory SNP in the oxytocin receptor gene, also known to influence human pair bonding, was found to be associated with proximity seeking and friendliness in two dog breeds (25). However, behavioral genetic studies still face the challenge of understanding the genetic architecture of nearly every facet of a complex behavior.
Disclosed herein are methods of identifying dogs or wolves with predispositions for hyper-social behavior, e.g., human-directed hyper-social behavior. The methods involve identifying structural variants at specific genetic loci within the Williams-Beuren Syndrome (WBS) locus on chromosome 6 of the dogs or wolves. In some embodiments, the structural variants include at least one of Cfa6.6, Cfa6.7, Cfa6.66, or Cfa6.83. In some embodiments, the structural variants include at least one of the genes GTF2I, GTF2IRD1, and WBSCR17.
Accordingly, disclosed herein is a method for predicting the probability of a dog or wolf exhibiting a sociable behavior comprising:
- (a) genotyping a biological sample from a dog or wolf;
- (b) counting the number of structural variants within the Williams-Beuren Syndrome (WBS) locus on chromosome 6 of the dog or wolf; and
- (c) predicting the probability of the dog or wolf exhibiting a sociable behavior based on the number of structural variants.
The disclosure herein allows for improved methods of ranking dogs or wolves according to their sociability. Thus, disclosed herein is a method of ranking dogs or wolves according to their likelihood of exhibiting a sociable behavior comprising:
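- (a) obtaining a biological sample from a first dog or wolf;
- (b) determining the number of structural variants within the Williams-Beuren Syndrome (WBS) locus on chromosome 6 of the first dog or wolf;
- (c) obtaining a biological sample from a second dog or wolf;
- (d) determining the number of structural variants within the Williams-Beuren Syndrome (WBS) locus on chromosome 6 of the second dog or wolf; and
- (e) ranking the first dog or wolf as being more likely to exhibit a sociable behavior than the second dog or wolf if the number of structural variants determined in step (b) is greater than the number determined in step (d); or
- (f) ranking the second dog or wolf as being more likely to exhibit a sociable behavior than the first dog or wolf if the number of structural variants determined in step (d) is greater than the number determined in step (b).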
In some embodiments, the biological sample is blood, saliva, cerebrospinal fluid, skin, or urine.
In some embodiments, genotyping the biological sample includes PCR amplification and agarose gel electrophoresis. In some embodiments, the genotyping utilizes at least one primer selected from the group consisting of:
In some embodiments, the structural variant is a transposable element that interrupts a gene in the WBS locus. In some embodiments, the transposable element is a retrotransposon. In some embodiments, the retrotransposon is a short interspersed nuclear element (SINE) or a long interspersed nuclear element (LINE).
In some embodiments, the method identifies at least one structural variant that occurs within at least one gene selected from the group consisting of GTF2I, GTF2IRD1, and WBSCR17.
In some embodiments, the social behavior is selected from the group consisting of attentional bias to social stimuli (ABS), hyper-sociability (HYP), and social interest in strangers (SIS).
In some embodiments, the methods disclosed herein include counting structural variants found at Cfa6.6, Cfa6.7, Cfa6.66, and Cfa6.83.
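The counting and ranking steps described above can be sketched in Python. This is a minimal illustration and not the disclosed assay: the genotype encoding (a mapping from locus name to a presence flag), the function names `count_structural_variants` and `rank_by_variant_count`, and the two sample genotypes are all assumptions made for the example.

```python
# Loci named in the disclosure; the presence/absence encoding is an assumption.
WBS_LOCI = ("Cfa6.6", "Cfa6.7", "Cfa6.66", "Cfa6.83")


def count_structural_variants(genotype):
    """Count the listed loci at which a structural variant was detected."""
    return sum(1 for locus in WBS_LOCI if genotype.get(locus, False))


def rank_by_variant_count(animals):
    """Return animal names ordered from most to fewest structural variants."""
    return sorted(
        animals,
        key=lambda name: count_structural_variants(animals[name]),
        reverse=True,
    )


# Invented example genotypes (True means a variant was detected at the locus).
animals = {
    "dog_A": {"Cfa6.6": True, "Cfa6.7": True, "Cfa6.66": False, "Cfa6.83": True},
    "dog_B": {"Cfa6.6": False, "Cfa6.7": True, "Cfa6.66": False, "Cfa6.83": False},
}
ranking = rank_by_variant_count(animals)
```

Here `ranking` lists the animal with the larger variant count first, mirroring the comparison made in the ranking method. How a count maps to a probability of sociable behavior is not specified by this sketch.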
Disclosed herein is a method of screening a dog or wolf library comprising:
- (a) obtaining a genomic library from a dog or wolf that contains the Williams-Beuren Syndrome (WBS) locus on canine chromosome 6; and
- (b) determining the number of structural variants in the WBS locus.
In some embodiments, the location of the structural variants is also determined.
In some embodiments, step (b) comprises determining the number of structural variants in at least one of GTF2I, GTF2IRD1, and WBSCR17. In some embodiments, step (b) comprises determining the number of structural variants in all of GTF2I, GTF2IRD1, and WBSCR17.
In some embodiments, step (b) comprises the use of the polymerase chain reaction (PCR) to amplify at least one DNA fragment from the WBS locus. In some embodiments, the DNA fragment comprises at least one of the loci Cfa6.6, Cfa6.7, Cfa6.66, or Cfa6.83.
In some embodiments, step (b) comprises the use of PCR to amplify the locus Cfa6.6 using the primers CCCCTTCAGCCAGCATATAA (SEQ ID NO: 1) (forward) and TTCTCTGGGCTGTCTGGACT (SEQ ID NO: 2) (reverse).
In some embodiments, step (b) comprises the use of PCR to amplify the locus Cfa6.6 using the primers AAGTTTCTCTGATGGAAAACACA (SEQ ID NO: 3) (forward) and GGTGGCTGGAAATTTCAGTAG (SEQ ID NO: 4) (reverse).
In some embodiments, step (b) comprises the use of PCR to amplify the locus Cfa6.7 using the primers TGGAGCCATGATTAGGAAGG (SEQ ID NO: 5) (forward) and TAAGGAAGGACCCCATTTCC (SEQ ID NO: 6) (reverse).
In some embodiments, step (b) comprises the use of PCR to amplify the locus Cfa6.66 using the primers TGCTGCTTCATGTTCTGTGA (SEQ ID NO: 7) (forward) and TGGTGCATTAGCTTTGGTTG (SEQ ID NO: 8) (reverse).
In some embodiments, step (b) comprises the use of PCR to amplify the locus Cfa6.83 using the primers AACCACAGGAACAAAACCTCA (SEQ ID NO: 9) (forward) and CCTCCTGTTGGACATTTGGA (SEQ ID NO: 10) (reverse).
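For bookkeeping, the primer pairs listed above can be held in a small lookup structure. The following Python sketch copies the sequences and locus labels exactly as they appear in the preceding embodiments; the helper names `reverse_complement`, `gc_fraction`, and `primers_for` are hypothetical conveniences, not part of the disclosure.

```python
# Primer pairs copied from the embodiments above: (locus, forward, reverse).
PRIMER_PAIRS = [
    ("Cfa6.6", "CCCCTTCAGCCAGCATATAA", "TTCTCTGGGCTGTCTGGACT"),      # SEQ ID NOs: 1, 2
    ("Cfa6.6", "AAGTTTCTCTGATGGAAAACACA", "GGTGGCTGGAAATTTCAGTAG"),  # SEQ ID NOs: 3, 4
    ("Cfa6.7", "TGGAGCCATGATTAGGAAGG", "TAAGGAAGGACCCCATTTCC"),      # SEQ ID NOs: 5, 6
    ("Cfa6.66", "TGCTGCTTCATGTTCTGTGA", "TGGTGCATTAGCTTTGGTTG"),     # SEQ ID NOs: 7, 8
    ("Cfa6.83", "AACCACAGGAACAAAACCTCA", "CCTCCTGTTGGACATTTGGA"),    # SEQ ID NOs: 9, 10
]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases in the sequence that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def primers_for(locus):
    """All (forward, reverse) pairs recorded for a locus label."""
    return [(fwd, rev) for name, fwd, rev in PRIMER_PAIRS if name == locus]
```

A call such as `primers_for("Cfa6.7")` returns the single pair recorded for that locus, and `gc_fraction` gives a quick sanity check on each oligonucleotide before ordering.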
In some embodiments, step (b) comprises the use of agarose gel electrophoresis to identify DNA fragments from the WBS locus that have altered mobility compared to the corresponding fragments from the dog reference genome and that are indicative of structural variants in the WBS locus from the library.
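The mobility comparison described above amounts to checking an estimated amplicon size against the size expected from the reference genome. A minimal Python sketch follows; the function name `call_structural_variant`, the 20 bp tolerance, and the example sizes are assumptions chosen for illustration, since the disclosure does not state expected amplicon sizes or a size threshold.

```python
def call_structural_variant(observed_bp, reference_bp, tolerance_bp=20):
    """Classify an amplicon by comparing its estimated size to the reference size.

    tolerance_bp is an assumed allowance for gel sizing error, not a disclosed value.
    """
    difference = observed_bp - reference_bp
    if abs(difference) <= tolerance_bp:
        return "reference-like"
    return "insertion" if difference > 0 else "deletion"


# Hypothetical band sizes read from a gel, all against a 400 bp reference amplicon.
calls = [call_structural_variant(size, 400) for size in (405, 600, 250)]
```

A band that runs clearly slower than the reference fragment is called as a candidate insertion, and one that runs faster as a candidate deletion. Confirmation by sequencing would still be needed to identify the variant.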
In some embodiments, step (b) comprises a hybridization step using at least one probe from the WBS locus that identifies structural variants in the WBS locus. In some embodiments, the hybridization step comprises fluorescence in-situ hybridization (FISH).
Also disclosed herein are canine breeding methods. The methods disclosed herein that allow for the prediction of sociability characteristics of canines permit breeders to select those canines for breeding that have desirable sociability characteristics. That is, by choosing canines for breeding that contain appropriate structural variants of the WBS locus, and by not choosing for breeding those canines that do not contain those variants, breeders can increase the likelihood that offspring will exhibit desirable sociability characteristics such as attentional bias to social stimuli (ABS), hyper-sociability (HYP), and social interest in strangers (SIS).
Over time, this can lead to the development of breeding lines of canines that are more suitable for certain roles; e.g., canines that are better family pets, because they are more attached to their owners. Similarly, undesirable traits such as aloofness or excessive aggression can be eliminated or reduced.
Accordingly, a further aspect of the disclosure herein is a method of producing dogs that are more likely to exhibit a sociable behavior comprising:
- (a) selecting a male and female dog for breeding that each are known to have at least one structural variant within Cfa6.6, Cfa6.7, Cfa6.66, or Cfa6.83 in the Williams-Beuren Syndrome (WBS) locus; and
- (b) mating the dogs of step (a) to produce offspring.
The disclosure herein also includes a method of producing dogs that are more likely to exhibit a sociable behavior comprising:
- (a) genotyping male and female dogs for the presence of structural variants within the Williams-Beuren Syndrome (WBS) locus;
- (b) selecting a male and female dog that each have at least one structural variant in Cfa6.6, Cfa6.7, Cfa6.66, or Cfa6.83 in the WBS locus; and
- (c) mating the dogs of step (b) to produce offspring.
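The selection step of this breeding method can be sketched as a filter over genotyped animals. In the Python example below, the genotype encoding, the function names `carries_variant` and `candidate_pairs`, and the sample animals are hypothetical; only the four locus names come from the disclosure.

```python
from itertools import product

TARGET_LOCI = ("Cfa6.6", "Cfa6.7", "Cfa6.66", "Cfa6.83")


def carries_variant(genotype):
    """True if a structural variant was detected at any of the target loci."""
    return any(genotype.get(locus, False) for locus in TARGET_LOCI)


def candidate_pairs(males, females):
    """All (male, female) name pairs in which both animals carry a variant."""
    return [
        (m, f)
        for m, f in product(sorted(males), sorted(females))
        if carries_variant(males[m]) and carries_variant(females[f])
    ]


# Invented genotypes; Cfa6.3 is deliberately outside the target list.
males = {"m1": {"Cfa6.6": True}, "m2": {}}
females = {"f1": {"Cfa6.83": True}, "f2": {"Cfa6.3": True}}
pairs = candidate_pairs(males, females)
```

Only pairs in which both the male and the female carry at least one variant at a target locus are retained, which corresponds to step (b) of the method.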
In some embodiments, the structural variant is at Cfa6.6, Cfa6.7, Cfa6.66, and Cfa6.83. In some embodiments, the structural variant occurs within at least one gene selected from the group consisting of GTF2I, GTF2IRD1, and WBSCR17.
Disclosed herein is a method of editing the genome of a dog comprising:
- (a) obtaining a dog;
- (b) using clustered regularly interspaced short palindromic repeats (CRISPRs)/CRISPR-associated (Cas) 9 to inactivate a gene in the Williams-Beuren Syndrome (WBS) locus on canine chromosome 6.
See Zou et al., Journal of Molecular Cell Biology (2015), 7(6), 580-58.
In some embodiments, the dog is obtained because it is desirable to increase the sociability of the dog.
In some embodiments, the gene is GTF2I, GTF2IRD1, or WBSCR17.
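As a hedged illustration of one preparatory step for such editing, the Python sketch below scans a DNA sequence for candidate 20-nucleotide protospacers followed by an NGG protospacer-adjacent motif, the motif recognized by the commonly used Streptococcus pyogenes Cas9. The disclosure does not describe guide design, so the function name `find_protospacers`, the forward-strand-only search, and the example sequence are all assumptions.

```python
def find_protospacers(sequence, guide_length=20):
    """Return (start, protospacer, pam) for each forward-strand NGG site.

    Only the given strand is searched; a fuller tool would also scan the
    reverse complement and score off-target risk.
    """
    sequence = sequence.upper()
    hits = []
    for pam_start in range(guide_length, len(sequence) - 2):
        pam = sequence[pam_start:pam_start + 3]
        if pam[1:] == "GG":
            start = pam_start - guide_length
            hits.append((start, sequence[start:pam_start], pam))
    return hits


# Invented sequence, not taken from GTF2I, GTF2IRD1, or WBSCR17.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
sites = find_protospacers(example)
```

In practice the input would be exon sequence from the targeted gene taken from the dog reference genome, and candidate sites would be filtered further before use.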
A further aspect of the disclosure herein is a kit for detecting the presence of structural variants within the Williams-Beuren Syndrome (WBS) locus of canines. The kit may comprise one or more primers suitable for use in PCR-based processes for detecting the structural variants. Such primers include:
In some embodiments, the kit comprises the primers CCCCTTCAGCCAGCATATAA (SEQ ID NO: 1) and TTCTCTGGGCTGTCTGGACT (SEQ ID NO: 2).
In some embodiments, the kit comprises the primers AAGTTTCTCTGATGGAAAACACA (SEQ ID NO: 3) and GGTGGCTGGAAATTTCAGTAG (SEQ ID NO: 4).
In some embodiments, the kit comprises the primers TGGAGCCATGATTAGGAAGG (SEQ ID NO: 5) and TAAGGAAGGACCCCATTTCC (SEQ ID NO: 6).
In some embodiments, the kit comprises the primers TGCTGCTTCATGTTCTGTGA (SEQ ID NO: 7) and TGGTGCATTAGCTTTGGTTG (SEQ ID NO: 8).
In some embodiments, the kit comprises the primers AACCACAGGAACAAAACCTCA (SEQ ID NO: 9) and CCTCCTGTTGGACATTTGGA (SEQ ID NO: 10).
In some embodiments, the kit further comprises instructions for use. In another embodiment, the primers are labeled using a detectable marker. The kit may further comprise at least one additional reagent such as buffers, dNTPs, DNA polymerases, DNA ligases, and restriction enzymes.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
“Detectable marker” refers to a moiety attached to an entity (such as a probe) to render the entity detectable. The moiety itself need not be detectable; it may become detectable upon reaction with yet another moiety. Detectable markers include fluorophores, chromophores, radioactive isotopes, chemiluminescent agents, haptens, and magnetic particles.
“Genotyping” refers to structural analysis of the Williams-Beuren Syndrome locus on canine chromosome 6 that provides information regarding the presence of structural variants in the WBS locus. Genotyping may be accomplished by any means known in the art, e.g., DNA sequencing, the use of PCR followed by agarose gel electrophoresis, or hybridization assays.
“Hyper-sociability” refers to a heightened propensity to initiate social contact that is often extended to members of another species.
The present inventors have determined that structural variants in genes associated with human Williams-Beuren Syndrome underlie stereotypical hyper-sociability in domestic dogs. Accordingly, disclosed herein are genetic variants associated with human-directed hyper-social behavior in domestic dogs and a method to detect the same.
A candidate locus associated with WBS in humans and known to be under positive selection in the domestic dog genome (19) was identified and resequenced. It was found that this region also harbors a large number of highly polymorphic structural variants (SVs) in canines, some of which are private to an individual dog or breed. This finding is concordant with the genetic heterogeneity of WBS in humans, where deletions range from 100 Kb to 1.8 Mb in size with variable breakpoints, attributed to chromosomal instability (42-44). SVs found in multiple individuals were identified that were significantly associated with one or more quantified behavioral traits informative on hyper-sociability and cognition.
Domestic dogs exhibit some of the key behavioral traits quantified in individuals with WBS, most notably hyper-sociability in the absence of superior social cognition. A 5 Mb genomic region on chromosome 6 previously found to be under positive selection in domestic dog breeds was analyzed by the present inventors. Deletion of this region in humans is linked to Williams-Beuren syndrome (WBS), a multi-system congenital disorder characterized by hyper-social behavior. Quantitative data on behavioral phenotypes symptomatic of WBS in humans were associated with structural changes in the WBS locus in dogs. It was found that hyper-sociability, a central feature of WBS, is also a core element of domestication that distinguishes dogs from wolves. Evidence is provided herein that structural variants in GTF2I and GTF2IRD1, genes previously implicated in the behavioral phenotype of patients with WBS and contained within the WBS locus, contribute to extreme sociability in dogs. This finding suggests that there are commonalities in the genetic architecture of WBS and canine tameness, and that directional selection may have targeted a unique set of linked behavioral genes of large phenotypic effect, allowing for rapid behavioral divergence of dog and wolf, facilitating co-existence with humans.
A third described gene, WBSCR17, has not been previously associated with sociability. However, this gene is up-regulated in cells treated with N-acetylglucosamine, a glucose derivative, suggesting a role in carbohydrate metabolism (54). SVs in WBSCR17 may represent an adaptation to a starch-rich diet typical of living in human settlements, a speculation concordant with a previous study (55).
Two of the SVs most associated with hyper-sociability, a trait uniquely displayed in domestic dogs among the canids, were SINE and LINE transposable elements, sub-types of retrotransposons that have high rates of insertion (e.g., 1 in 108 human births have a de novo L1 insertion; 56). With large phenotypic consequences due to the amplification of a few loci, these mobile elements have been implicated in the evolution of the canid genome (e.g., 57,58), as well as canine disease, syndromes, and morphology (e.g., 59-64).
These TEs were surveyed in an extended sampling of wild and domestic canines and found to be extremely rare in coyotes, while other insertions were derived and found only to segregate within domestic dogs. With a larger sample size and leveraging behavioral phenotypes from breed stereotypes, a significant association was found between TE copy number and behavior. Hence, it is conceivable that selection acting on hyper-sociability-associated TEs may have helped shape the evolution of the canid family. Canine WBS-linked SVs likely contribute to the developmental delay that facilitates ease of forming inter-species bonds and the juvenile-like hyper-sociability exhibited towards these social companions into adulthood. This coupling presents an intriguing parallel to the same processes observed in WBS affected individuals (20).
The genetic variants disclosed herein are associated with hyper-social behavior in domestic dogs and wolves, and will allow for a test to identify domestic dogs with predispositions for behavioral disorders or traits that make them more or less suited for placement in certain homes or working roles. This test might similarly be used in captive wolves to inform breeding practices. The disclosed approach allows for a commercial test to genotype dogs for the presence (or absence) of these genetic variants. In some embodiments, the disclosed test is a PCR-based test of specific genetic loci that are informative regarding the genetic influence for behavior.
A commercial genetic test employing the disclosed approach can genotype and count the number of genetic variants carried by each individual dog. The presence or absence of each variant can be assessed for a probability of how much more (or less) social the dog is as a direct result of the genotype, referred to as the allelic effect.
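One simple way to combine per-variant allelic effects is an additive score, sketched below in Python. The effect sizes are the ABS β values (without covariates) reported in the association analysis of this disclosure; the additive combination, the copy-number encoding of the genotype, and the function name `additive_abs_score` are assumptions, and the result is a relative index, not a calibrated probability.

```python
# ABS effect sizes (beta, without covariates) as reported in this disclosure.
ABS_EFFECT_SIZES = {
    "Cfa6.3": 0.11,
    "Cfa6.7": 0.12,
    "Cfa6.27": -0.15,
    "Cfa6.66": -0.23,
    "Cfa6.69": -0.15,
}


def additive_abs_score(genotype):
    """Sum beta times variant copy number over loci (assumed additive model)."""
    return sum(
        beta * genotype.get(locus, 0) for locus, beta in ABS_EFFECT_SIZES.items()
    )


# Invented genotype: one copy of the variant at each of two loci.
score = additive_abs_score({"Cfa6.3": 1, "Cfa6.7": 1})
```

Loci absent from the genotype contribute nothing to the score, so an animal with no detected variants scores zero under this assumed model.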
Some embodiments of the methods disclosed herein utilize primers or probes. Primers and probes may be oligonucleotides of at least 15 nucleotides in length. Primers are usually 15 base pairs to 100 base pairs in length, and preferably are 17 base pairs to 30 base pairs in length. The primer is not particularly limited as long as it is capable of amplifying at least a part of a DNA comprising the canine Williams-Beuren Syndrome locus on chromosome 6. The length of DNA which primers amplify is usually 15-1000 base pairs, preferably 20-500 base pairs, and more preferably 20-200 base pairs. When the oligonucleotide is used as a probe, its length is usually 5 base pairs to 200 base pairs, preferably 7 base pairs to 100 base pairs, more preferably 7 base pairs to 50 base pairs. The probe is not particularly limited as long as it is capable of hybridizing to a DNA comprising the canine Williams-Beuren Syndrome locus on chromosome 6.
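The primer length ranges stated above can be checked mechanically. The Python sketch below classifies an oligonucleotide against the "usual" (15 to 100) and "preferred" (17 to 30) primer lengths given in this paragraph; the function name `primer_length_class` and the returned labels are hypothetical.

```python
PRIMER_USUAL = (15, 100)      # usual primer length range stated above
PRIMER_PREFERRED = (17, 30)   # preferred primer length range stated above


def primer_length_class(sequence):
    """Label a primer as 'preferred', 'usual', or 'outside stated range' by length."""
    n = len(sequence)
    if PRIMER_PREFERRED[0] <= n <= PRIMER_PREFERRED[1]:
        return "preferred"
    if PRIMER_USUAL[0] <= n <= PRIMER_USUAL[1]:
        return "usual"
    return "outside stated range"


# SEQ ID NO: 1 from this disclosure, plus two artificial sequences.
labels = [
    primer_length_class(s)
    for s in ("CCCCTTCAGCCAGCATATAA", "A" * 50, "ACGT")
]
```

The same pattern could be applied to probes by substituting the probe length ranges given in the paragraph.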
In preferred embodiments, the primers are used in pairs that together amplify a region of the canine Williams-Beuren Syndrome locus on chromosome 6 that includes a structural variant. In preferred embodiments, the region is a region from at least one of GTF2I, GTF2IRD1, and WBSCR17.
In some embodiments, the probes hybridize to the canine Williams-Beuren Syndrome locus on chromosome 6 in which at least one of GTF2I, GTF2IRD1, and WBSCR17 does not contain a structural variant but do not hybridize to the canine Williams-Beuren Syndrome locus on chromosome 6 in which at least one of GTF2I, GTF2IRD1, and WBSCR17 does contain a structural variant. In some embodiments, the hybridization conditions are stringent hybridization conditions (see, for example, the conditions disclosed in Sambrook et al., Molecular Cloning, Cold Spring Harbor Laboratory Press, New York, USA, the 2nd edition, 1989).
A person of ordinary skill in the art would be able to design appropriate primers and probes for the methods disclosed herein based on the teachings herein with respect to GTF2I, GTF2IRD1, and WBSCR17 and the dog reference genome.
In some embodiments, the probes are immobilized on a solid phase. Examples of solid phases include, but are not limited to, microplate wells, plastic beads, nylon membranes, and magnetic particles.
The human-directed sociability of 18 domestic dogs and ten captive human-socialized gray wolves was evaluated using standard sociability (26,27) and problem solving tasks (2,8,28) commonly used to assess human-directed sociability in canines. Three sociability metrics were constructed to assess behaviors indicative of WBS (22): attentional bias to social stimuli (ABS), hyper-sociability (HYP), and social interest in strangers (SIS) (Tables 1, 2).
Solvable task performance was used to assess attentional bias towards social stimuli and independent problem solving performance (independent physical cognition). Subjects were given up to two minutes to open a solvable puzzle box (8) that contained half of a 2.5 cm thick piece of summer sausage, both when alone and with a neutral human present. The trial was considered complete after meeting one of the following conditions: the puzzle box lid was completely removed, the food obtained, or two minutes elapsed. All trials were video recorded and coded for whether the puzzle box was solved and the time to solve it. To compare attention towards the puzzle box versus social stimuli in the human-present condition, the percentage of time spent looking at the puzzle box, touching the puzzle box, and looking at the human was recorded (8). An independent researcher, who was blind to the purpose of this study, coded 30% of the videos, and inter-rater reliability was very strong (weighted Cohen's kappa, κ=0.98; 95% confidence interval: 0.97-0.99). Domestic dogs spent a significantly greater proportion of trial time gazing at the human when compared to wolves when a human was present during the solvable task (median gaze towards human: dog=21%, wolf=0%; Two tailed Mann-Whitney, ndog=18, nwolf=10, U=6, p<0.0001). Dogs also spent a significantly smaller proportion of trial time looking at the puzzle box (median gaze towards box: dog=10%, wolf=100%; Two tailed Mann-Whitney, ndog=18, nwolf=10, U=171.5, p=0.0001) and a significantly smaller proportion of trial time trying to solve the puzzle (median dog=6%, wolf=98%; Two tailed Mann-Whitney, ndog=18, nwolf=10, U=175, p<0.0001) compared to wolves, a finding that has been equated with social inhibition of problem solving behavior in both the canine (9) and human WBS literature (22).
Significantly more wolves successfully solved the task when compared to dogs in both the human present and alone conditions (Human present: 2/18 dogs successful, 8/10 wolves successful, Two-tailed Fisher's exact test, p=0.0005; Alone: 2/18 dogs successful, 9/10 wolves successful, Two-tailed Fisher's exact test, p=0.0001). Overall, concordant with WBS, dogs displayed greater ABS than wolves, corresponding to a reduction in independent problem solving success.
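The two Fisher's exact tests reported above can be recomputed from the stated success counts. The Python sketch below uses only the standard library and one common definition of the two-sided p-value (summing all tables no more probable than the observed one); whether the original analysis used this exact definition is an assumption, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k successes in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum probabilities of all tables at least as extreme as the observed one.
    return sum(
        prob(k) for k in range(lo, hi + 1) if prob(k) <= p_observed * (1 + 1e-9)
    )


# Rows: dogs, wolves. Columns: solved, not solved.
p_human_present = fisher_exact_two_sided(2, 16, 8, 2)
p_alone = fisher_exact_two_sided(2, 16, 9, 1)
```

Under this definition the human-present comparison gives roughly 0.00054 and the alone comparison roughly 0.00007, which agree with the reported p=0.0005 and p=0.0001 after rounding.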
The sociability test measured human-directed proximity seeking behavior, and was assessed by comparing total sociability scores across all sociability conditions. Each phase occurred twice, once with an unfamiliar human and once with a familiar human, totaling four phases run over eight consecutive minutes. In all phases, the experimenter sat on a familiar chair (dogs) or bucket (wolves) inside a marked circle of 1 m circumference denoting proximity. During the passive phase, the experimenter sat quietly on the chair or bucket and ignored the subject by looking down toward the floor. If the animal sought physical contact, then the experimenter touched the subject twice, but did not speak or make eye contact with the animal. During the active phase, the experimenters called the animals by name and actively encouraged contact while remaining in their designated location. Dogs spent more time in proximity to humans than did wolves (median percent of time spent within 1 m of humans: dogs=65%, wolves=35%; Two tailed Mann-Whitney, ndog=18, nwolf=9, U=30, p<0.005). Dog and wolf sociability towards an unfamiliar human was used to assess social interest in strangers. Dogs spent more time within 1 m of a stranger when compared to wolves (median dogs=53%, wolves=28%); however, this difference was not statistically significant (Two tailed Mann-Whitney, ndog=18, nwolf=9, U=76, p=0.51). In summary, dogs were hyper-social compared to wolves, although there was no significant difference in their social interest in strangers.
The dimensionality of six behavioral traits (Table 3) was reduced to three principal components that are orthogonal and uncorrelated to each other, whereas ABS, HYP and SIS are correlated.
Principal Components 1, 2, and 3 accounted for 50%, 22%, and 14% of total behavioral variation, respectively. Both the Kaiser-Meyer-Olkin (KMO) measure (KMO=0.62, with values>0.6 recommended as informative) and Bartlett's test, which was significant (χ2(15)=60.42, p=2.13×10−7), were calculated. Analysis of the loadings of the constituent behaviors (Table 3;
In a subset of animals with quantitative behavioral data (ndog=16; nwolf=8), paired-end 2×67 nt sequence data were collected from 5 Mb spanning the candidate canine WBS locus on canine chromosome 6 (2,031,491-7,215,670 bp) which contains 46 annotated genes, 27 of which are in the human WBS locus (Tables 4, 5). The target region had an average of 15.5-fold sequence coverage (dogs: 15.2; wolves: 16.0) (Table 4).
Genotypes for 26,296 SNPs were obtained, which were further filtered to retain 4,844 SNPs with non-missing polymorphic data (average density of 1 SNP every 14.4 Kb). To confirm that this region contains species-specific variation, it was determined whether the region displays signals of positive selection in the dog genome, in an effort to independently validate the original finding (19). The composite bivariate percentile score was calculated and confirmed that the candidate gene, WBS Chromosome Region 17 (WBSCR17), is under positive selection as a domestication candidate and was significantly depleted of heterozygosity in the dog (mean HO: dog=0.01, wolf=0.37; 1-tailed t-test with unequal variance, p=7.4×10−38).
As this candidate region shows structural variation (SV) linked to WBS in humans (20), and is known to vary widely in its functional consequences (e.g., neurodevelopmental diseases; autism spectrum disorders), in silico SV annotation in the dog and wolf genomes was completed using three programs, SVMerge (31), SoftSearch (32), and inGAP-SV (33), which together utilize all available SV detection algorithms: read pair (RP), split read (SR), read depth (RD), and assembly-based (AS). Relative to the reference dog genome, 38 deletions, 30 insertions, 13 duplications, six transpositions, a single inversion, and one complex variant were annotated (Tables 6, 7).
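The annotated variant classes listed above can be tallied to give the total number of SVs and the share contributed by each class. The Python sketch below uses only the counts stated in this paragraph; the dictionary keys and variable names are illustrative.

```python
# Counts of annotated structural variants by class, as stated above.
SV_COUNTS = {
    "deletion": 38,
    "insertion": 30,
    "duplication": 13,
    "transposition": 6,
    "inversion": 1,
    "complex": 1,
}

total_svs = sum(SV_COUNTS.values())

# Share of each class among all annotated variants.
sv_shares = {kind: count / total_svs for kind, count in SV_COUNTS.items()}
```

The tally comes to 89 annotated variants, of which deletions and insertions together account for a little over three quarters.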
There was considerable private variation, with 31 annotated SVs found only in dogs, 26 found only in wolves, and a level of heterogeneity observed in wolves that is comparable to that found in human WBS (34) (mean n: wolf=21, dog=15, 2-tailed t-test p=0.026) (Table 8).
Linear mixed models were used to determine the association of SVs with human-directed sociability. Three univariate models were tested for their association with each of the three behavioral indices (ABS, HYP, SIS).
In addition, two intergenic SVs were significantly associated with ABS (Cfa6.69, p=1.56×10−4; Cfa6.27, p=3.31×10−4), and Cfa6.27 was also associated with the PCs (p=1.24×10−4). However, the analyses were focused on genic SVs to infer any potential functional impact. Cfa6.66 was associated with multiple sociability metrics (ABS and SIS) and had the strongest two association signals (p=1.38×10−4 and p=1.95×10−4, respectively) (Table 9). GTF2I and GTF2IRD1 are members of the TFII-I family of transcription factors, a set of paralogous genes which have been repeatedly linked to the expression of hyper-sociability in mice (35,36), and are specifically implicated in the hyper-sociable phenotype of persons with WBS (37,38).
To disentangle the association of SVs with behavior from an association with species membership, species was incorporated as a covariate (Table 10).
These analyses were consistent with the initial findings for Cfa6.66, Cfa6.3 and Cfa6.7. Locus Cfa6.66 remained significantly associated with multiple sociability metrics (ABS, p=2.33×10−4; SIS, p=1.67×10−3) and showed the strongest association of any genic SV. Cfa6.3 and Cfa6.7 both retained their associations with ABS (p=1.06×10−3 and p=9.56×10−4, respectively), as did the intergenic SVs Cfa6.69 (p=1.36×10−4) and Cfa6.27 (p=5.56×10−4). Furthermore, the ABS effect size (β) remained stable for the association models with and without species membership as a covariate (ABS β without covariates: Cfa6.3=0.11, Cfa6.7=0.12, Cfa6.27=−0.15, Cfa6.66=−0.23, Cfa6.69=−0.15; ABS β with covariates: Cfa6.3=0.081, Cfa6.7=0.10, Cfa6.27=−0.13, Cfa6.66=−0.23, Cfa6.69=−0.14), indicating that the observed effects on sociability are not an artifact of species differences. An association test of each locus with species membership further supports this interpretation, as none of the behavior-associated SVs was significantly associated with species membership alone (Table 11).
It was next determined whether these behavior-associated SVs were predicted to have a functional impact. Ensembl's Variant Effect Predictor (VEP) v84 (39) was used with Ensembl transcripts for the CanFam 3.1 reference genome to assign putative functional consequences to all insertions, deletions, and duplications in the filtered set of SVs. Because of a software limitation, VEP is unable to assign consequences for transpositions, inversions, and complex SVs, so eight sites (6 TRA, 1 INV, 1 D_I) were manually inspected in the UCSC genome browser with Ensembl gene models (40). Three transcript ablations, seven loss-of-start codons, and five transcript amplifications (Table 12) were found.
All SVs significantly associated with human-directed social behavior were ‘feature truncations’, except for Cfa6.3, which was a ‘feature elongation’ that likely is due to a lost stop codon or the elongation of an internal sequence feature relative to the reference. Annotation of Cfa6.3, Cfa6.7, Cfa6.66 and Cfa6.72 as modifiers of gene function suggests a direct association between these variants and human-directed social behavior as quantified by behavioral measures, mediated by possible interference with WBSCR17, GTF2I and GTF2IRD1.
The in silico SV detection algorithms applied to the targeted resequencing data can identify the presence or absence of an SV, but cannot predict the underlying genotype of an individual for a given SV. To corroborate the in silico findings and investigate the possibility of other genetic models, PCR amplification and agarose gel electrophoresis were used to determine the codominant genotypes at the top four loci (Cfa6.6, Cfa6.7, Cfa6.66, and Cfa6.83).
All outlier SVs, now with codominant genotypes, were significantly associated with species membership (Cfa6.6 χ2=23.91, p=1.01×10−6, OR=0.33; Cfa6.7 χ2=57.63, p=3.16×10−14, OR=13.83; Cfa6.66 χ2=35.12, p=3.1×10−9, OR=0.25; Cfa6.83 χ2=17.11, p=3.53×10−5, OR=NA), confirming this region's original identification (19). Similar results were obtained if only “modern” breeds were included, as per the original method that located this region (19) (Cfa6.6 χ2=11.9, p=0.0006, OR=0.45; Cfa6.7 χ2=40.87, p=1.63×10−10, OR=10.35; Cfa6.66 χ2=41.97, p=9.25×10−11, OR=0.20; Cfa6.83 χ2=20.41, p=6.24×10−6, OR=NA), with site-specific patterns (frequency of TE insertion in modern dogs and wolves, respectively: Cfa6.6=0.52 and 0.32; Cfa6.7=0.39 and 0.06; Cfa6.66=0.10 and 0.37; Cfa6.83=0.17 and 0.00).
The frequency of insertions per locus by population or species membership was calculated. The TEs segregated at low frequencies in coyotes and were variable across wolf populations and dog breeds.
One-way ANOVA was conducted using the population or species designation as a predictor of the total number of insertions across the four outlier loci. The total number of insertions depends significantly on the population (F(23,274)=19.54; p<2×10−16), with 103 of 276 pairwise population mean comparisons contributing to the ANOVA significance (dog/dog=46, wolf/dog=28, coyote/dog=11, semi-domestic/dog=8, semi-domestic/coyote=3, semi-domestic/wolf=3, wolf/coyote=2, and wolf/wolf=2; Tukey HSD, p<0.05).
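The degrees of freedom reported above can be cross-checked with a short calculation. The following Python sketch is illustrative only: it computes a one-way ANOVA F statistic from scratch on a small hypothetical data set (the insertion counts below are invented, not study data) and shows how 23 numerator degrees of freedom correspond to 24 groups and therefore 276 pairwise comparisons.

```python
from math import comb

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical insertion totals for three populations (NOT study data).
toy_groups = [[0, 1, 1], [2, 3, 3], [4, 4, 5]]
f_stat, df_b, df_w = one_way_anova(toy_groups)

# F(23, 274) implies 23 + 1 = 24 groups; all unordered pairs of 24 groups:
n_groups = 23 + 1
n_pairs = comb(n_groups, 2)
```

Under these assumptions, `n_pairs` evaluates to 276, which agrees with the number of pairwise comparisons stated in the text.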
As the gel-based genotyping method now reveals a codominant genotype compared to the in silico status, an association scan was conducted for each of the four outlier SV loci with the binary phenotype for each AKC breed (41), village dogs and pariah dogs as “Seeks attention” or “Avoids attention” using two logistic regression models in R, an additive and a dominant model, with sex as a covariate. The use of breed-based stereotypes is supported by the strict genetic isolation and selective breeding efforts that maintain breeds. As such, many traits strongly determined by genetic variation (including behavioral traits) can be predicted with high accuracy. The central foundation and advantage of domestication and breed formation is that selection for many traits, including behavior, has been very strong and, thus, the number of underlying genes is apt to be small. As a proof of principle, Jones et al. successfully mapped a variety of breed-associated traits in a genome-wide association study using dog “stereotypes” (9). They scored breeds for pointing, herding, boldness, and trainability, and identified one locus associated with pointing, three with herding, and one with trainability. Most importantly, they found five loci for boldness. These loci contain likely candidate genes, many of which are important in schizophrenia, dopamine receptors, and proteins linked to synaptic junctions. Vaysse et al. (16) also utilized breed stereotypes to map behaviors, such as boldness, sociability, curiosity, playfulness, chase-proneness, and aggressiveness. They mapped boldness to an intron of HMGA2, and sociability, defined as the “dog's attitude towards unknown people”, to a gene on the X chromosome, after excluding male dogs from the analysis to accurately compare autosomal and sex-chromosome patterns of genetic variation.
Significant support was found for an association between three of the four loci and the binary behavioral trait of seeking or avoiding attention (additive model: Cfa6.6 OR=0.303 p=2.79×10−10, Cfa6.7 OR=0.398 p=4.66×10−7, Cfa6.83 OR=2.95 p=2.83×10−4; dominant model: Cfa6.6 OR=0.184 p=8.22×10−7, Cfa6.7 OR=0.287 p=4.31×10−5, Cfa6.83 OR=5.04 p=6.50×10−4; sex was not a significant predictor in any of these models). SV Cfa6.66 was not significant (additive model: OR=0.852 p=0.496; dominant model: OR=0.573 p=0.124). Further, logistic regression found that TE copy number could significantly predict the binary breed stereotype behavior of attention seeking or avoidance (OR=0.676 per insertion, p=1.13×10−5 with no evidence of a sex effect).
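To make the per-insertion odds ratio concrete, the following Python sketch shows how an odds ratio from a logistic regression scales multiplicatively with copy number. It is a worked illustration only: the baseline odds are a hypothetical value, and which of the two attention categories was coded as the modeled outcome is not restated here, so the direction of the effect should be read from the original analysis.

```python
def odds_multiplier(or_per_unit: float, units: int) -> float:
    """Multiplicative change in odds after `units` additional insertions."""
    return or_per_unit ** units

def probability_from_odds(odds: float) -> float:
    """Convert odds to a probability."""
    return odds / (1.0 + odds)

OR_PER_INSERTION = 0.676   # reported odds ratio per insertion
BASELINE_ODDS = 1.0        # hypothetical baseline (probability 0.5), an assumption

# Odds and probability of the modeled outcome for 0 to 4 insertions.
table = []
for n_insertions in range(5):
    odds = BASELINE_ODDS * odds_multiplier(OR_PER_INSERTION, n_insertions)
    table.append((n_insertions, odds, probability_from_odds(odds)))
```

With these assumed inputs, two insertions multiply the baseline odds by 0.676 squared, or roughly 0.46.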
To identify additional candidate loci, genome-wide SNP genotypes were collected using the Affymetrix Axiom K9HDSNPA (643,641 loci) and Axiom K9HDSNPB (625,577 loci) arrays. A PCA was conducted on 544K genome-wide SNP genotypes to ensure the expected spatial clustering pattern of the samples. With a subset of 25,510 uncorrelated and unlinked SNPs, a PCA confirmed the discrete spatial separation of the two species (PC1, 29.9%; PC2, 11.8%).
Dogs and wolves were ensured to be in the same developmental stage by only including subjects over one year of age, well past the species-specific window for primary socialization. All dogs and wolves were socialized to humans as puppies, received daily contact from human caretakers, and experienced regular free-contact interactions with unfamiliar humans from puppyhood through the time of this study. To ensure the wolves used in this study had been socialized to accepted standards and were as familiar to their caretakers as possible, wolves were only included if they had been hand-reared by humans from before 10-14 days of age following the procedures established by Klinghammer & Goodman (70), and were still living in the same facility in which they were raised. Wolves experienced 24-hour contact with human caretakers for at least the first six weeks of life, followed by contact during daylight hours until four months of age and then daily human interaction with caretakers and other humans thereafter. Therefore, in the current study, the lower level of sociability displayed towards familiar individuals by wolves in comparison to pet dogs could not be explained by lack of initial bond formation (socialization) or insufficient familiarity with their caretakers. In fact, wolves did show social interest in their caretakers, approaching them for greetings when they entered during the sociability test in this study. However, they then returned to other activities. This pattern of behavior might be considered a ‘typical’ social greeting for bonded adult animals, whereas the prolonged greeting of pet dogs, sometimes lasting the full two minutes, would be considered exaggerated or hyper-social (7).
To ensure equivalent testing conditions, each species was tested in a controlled setting most consistent with its home environment (71). Dogs were individually tested at an indoor location in Corvallis, Oregon, USA; wolves were tested in a familiar outdoor enclosure at Wolf Park, Battle Ground, Indiana, USA. Testing procedures were the same for both species. Each subject was assessed using two tests designed to quantitatively probe their human-directed sociability along indices relevant to the clinical presentation of WBS: a solvable task test and a sociability test (7,8). Data from the solvable task test and sociability test were used to calculate three indices relevant to behaviors that typify WBS in humans: attentional bias to social stimuli (ABS), hyper-sociability (HYP), and social interest in strangers (SIS) (Table 15).
Those tests are described in detail in the following sections.
Solvable Tasks and Sociability Measures.
The solvable task test was used to measure individual problem-solving performance, attentiveness to humans, and the degree to which a familiar human's presence interfered with independent problem-solving behavior. Although this problem-solving task is considered challenging, it has previously been validated as physically solvable by wolves, small dogs, and large dogs (8). All subjects were naïve to the problem prior to testing, and humans were instructed to remain passive and neutral after placing the container on the ground.
The sociability test consisted of a passive and an active phase, each lasting two minutes. One wolf (ID 2794) was not available for sociability testing; therefore, sociability analysis was conducted on all 18 dogs and 9 wolves. The experimenter spoke to and touched the subject if the animal came close enough to reach while remaining on the bucket or chair. If the animal moved away, then the experimenter called his/her name again to regain the subject's attention. All trials were recorded on video. For each condition, videos were coded for time spent in proximity to the experimenter, and time spent touching the experimenter (27). An independent coder blind to the purpose of this study double coded 42% of these videos; inter-rater reliability was determined to be strong using a weighted Cohen's kappa, K=0.75 (95% confidence interval: 0.64-0.86) (72).
It should be noted that many of the wolves in the current study have participated and performed as well as or better than pet domestic dogs on tasks related to social cognition (using human cues to solve problems) (26). In the current study they quickly approached the humans to initiate a greeting or to receive the puzzle box. The key difference observed was that adult dogs were more likely to engage in prolonged or exaggerated contact with humans than adult wolves.
Behavioral Indices Relevant to WBS in Humans.
Data from the solvable task test and the sociability test were used to quantify canine behavior along indices relevant to the sociable phenotype of WBS including: 1) time spent looking at the puzzle box in the solvable task test (“time look box”), 2) time spent looking at the human in the solvable task test (“time look human”), time spent in proximity to a familiar experimenter in the 3) active and 4) passive phases of the sociability test (“proximity familiar active” and “proximity familiar passive”), and time spent in proximity to an unfamiliar experimenter in the 5) active and 6) passive phases of the sociability test (“proximity unfamiliar active” and “proximity unfamiliar passive”).
Data from the solvable task test and sociability test were used to calculate three indices relevant to the behavior under selection during dog domestication and analogous to behaviors that typify WBS in humans: attentional bias to social stimuli (ABS), hyper-sociability (HYP), and social interest in strangers (SIS). ABS was calculated as the ratio of time spent looking at the experimenter to the sum of the time spent looking at the experimenter and the time spent looking at the puzzle box in the solvable task test and was intended to quantify the proportion of the animal's attention directed towards the experimenter. HYP was calculated as the sum of the time spent in proximity to the experimenter in each phase of the sociability test and was intended to quantify engagement with humans across social scenarios. SIS was calculated as the sum of the time spent in proximity to the experimenter in the two unfamiliar phases of the sociability test and was intended to quantify engagement with unfamiliar persons (Tables 2, 15).
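The three index definitions above can be written out as a brief Python sketch. The field names and the example durations are hypothetical and were chosen only for illustration; they are not taken from the study data.

```python
from dataclasses import dataclass

@dataclass
class SubjectTimes:
    # All durations in seconds; field names are illustrative assumptions.
    time_look_human: float
    time_look_box: float
    proximity_familiar_active: float
    proximity_familiar_passive: float
    proximity_unfamiliar_active: float
    proximity_unfamiliar_passive: float

def abs_index(t: SubjectTimes) -> float:
    """Attentional bias to social stimuli: share of looking time on the human."""
    total_look = t.time_look_human + t.time_look_box
    if total_look == 0:
        raise ValueError("no looking time recorded")
    return t.time_look_human / total_look

def hyp_index(t: SubjectTimes) -> float:
    """Hyper-sociability: proximity time summed over all four test phases."""
    return (t.proximity_familiar_active + t.proximity_familiar_passive
            + t.proximity_unfamiliar_active + t.proximity_unfamiliar_passive)

def sis_index(t: SubjectTimes) -> float:
    """Social interest in strangers: proximity time in the two unfamiliar phases."""
    return t.proximity_unfamiliar_active + t.proximity_unfamiliar_passive

# Hypothetical subject (NOT study data).
example = SubjectTimes(30.0, 90.0, 100.0, 80.0, 60.0, 40.0)
```

For this invented subject, ABS is 30/(30+90)=0.25, HYP is 280 seconds, and SIS is 100 seconds.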
Principal Components Analysis of Behavioral Indices.
Dog and wolf behavior was also characterized by principal components analysis using data from the Solvable Task Test (8) and Sociability Test (73) (Table 2) with the prcomp function in R (http://www.r-project.org/).
Inclusion of PCs was assessed with the nFactors package in R (74). The majority of component retention analyses indicated inclusion of the top two principal components (Kaiser's Rule: 2, Horn's parallel analysis: 2, acceleration factor: 2, optimal coordinates: 1). However, a relatively low percentage of behavioral variation was explained by the first two principal components (cumulatively, 72%), and the scree plot lacked an obvious knee.
Following behavioral trials, 2-3 ml of blood was collected from each dog and wolf from the cephalic, saphenous or jugular vein depending on the individual, temperament, and accessibility of the vein. Blood was deposited into a sterile blood collection tube, labeled, and then immediately placed in a freezer kept below −18 degrees Celsius until shipped overnight on ice for analysis. Of the 28 samples, 24 were chosen for sequencing (n: dogs=16, wolves=8). Two of the original 18 dogs were removed from sequencing due to their low DNA yield; two of the original 10 wolves were excluded from sequencing due to the lack of an opportunity to redraw blood samples from these individuals, either due to institutional protocols or due to the unavailability of the individual (Tables 1, 4). Genomic DNA was prepared from blood samples using QIAamp DNA mini kits (Qiagen, DNeasy blood and Tissue kit). DNA was quantified using a Qubit 2.0 Fluorometer and checked on a 2% agarose gel for degradation. A region on chromosome 6 under positive selection in the domestic dog genome, identified from a genome-wide scan of 48,036 SNPs (19), was followed up through targeted resequencing of a ~5 Mb contiguous block (2,031,491-7,215,670 bp) that contained 46 Ensembl-annotated genes (40,76), 27 of which have been described in WBS (Table 16).
A full-service option offered by MYcroarray for DNA enrichment and genomic library preparation was used. 80mer bait probes to target the region of interest (MYbaits kit design) were designed. Genomic DNA was sonicated to approximately 300 bp fragment sizes, of which 500 ng were used to construct Illumina TruSeq sequencing libraries. Each library was dual-index-amplified for eight cycles of PCR, yielding between 590 ng and 1744 ng of the sequencing library. Of this, 500 ng was used for the target enrichment with a custom MYbaits kit. Following enrichment, libraries were amplified for six cycles, yielding between 6.7 ng and 14.7 ng of library. Libraries were standardized by pooling 5 ng from each library to a volume of 30 uL at 4 ng/uL for paired-end 2×67 nt sequencing in a single lane of Illumina HiSeq2500. Refer to Table 5 for enrichment summary statistics.
For strict deplexing, sequences with perfect matches between the observed and expected index sequence tags were retained. Reads were trimmed and clipped with cutadapt-1.8.1 (77) to discard reads that were <20 bp in length, exclude sites of low quality (<20), and remove remnant TruSeq adapter sequence. Mean and standard deviations of library insert sizes were calculated individually for each animal with a custom python script (https://gist.github.com/davidliwei/2323462). All reads were mapped to the unmasked reference dog chromosome 6 (CanFam3.1, Ensembl) generated from a boxer breed individual with BWA-0.7.12 (78). PCR duplicates were marked and removed with picard-tools-1.138 (http://picard.sourceforge.net). BAM files were then indexed, sorted, and VCF files produced from SAMtools (79), from which sequencing descriptive statistics were calculated. From the sorted BAM files, ANGSD (80) was used to call SNP genotypes with a minimum depth of 10× sequence coverage, a minimum mapping quality 30, SNP p<0.00001 and posterior probability>0.95, and a minimum variant quality of 20. Scores were also adjusted around insertions/deletions with the -baq flag. Monomorphic sites were excluded.
SNP genotypes were phased with SHAPEIT (81). The region was scanned for signals of positive selection in the dog genome using cross population extended haplotype homozygosity (XP-EHH) of 4,844 SNPs within the resequenced region. Per-SNP FST was calculated with a custom script (19). Both the FST and XP-EHH scores were normalized into z-scores with a mean of zero and a standard deviation of 1. The product of their z-scores represented their composite “bivariate percentile score”. The empirical rule was used to identify outlier loci in the 97.5th percentile or greater (z-score>2). Peaks of selection had to contain at least three outlier loci to be considered.
Briefly, SVMerge is an SV-detection platform which implements the RP algorithm BreakDancer (83), the RP and SR algorithm Pindel (84), and an algorithm that clusters single-end mapped reads to detect insertions (85). The SVMerge pipeline implements its constituent SV callers, filters and merges the variant calls, then computationally validates breakpoints by Velvet de novo assembly (85). SoftSearch is an RP and SR algorithm and is also the only available SV detection platform that has been experimentally validated for high performance with custom resequencing data (86,87). inGAP-SV is an RD and RP algorithm that uses depth of coverage signatures to identify putative SVs, then refines and categorizes the variants based on RP signals (88). By integrating the output of these three programs, the strengths of all available SV detection algorithms were leveraged, and the best available method for custom resequencing data was incorporated.
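The composite score described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the custom script used in the study: it assumes the z>2 cutoff is applied to the product of the two z-scores, and it assumes a "peak" is a run of consecutive outlier SNPs, neither of which is specified in the text. The function names are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize constant values")
    return [(v - m) / s for v in values]

def composite_scores(fst, xpehh):
    """Per-SNP product of the FST z-score and the XP-EHH z-score."""
    return [a * b for a, b in zip(z_scores(fst), z_scores(xpehh))]

def outlier_peaks(scores, threshold=2.0, min_outliers=3):
    """Return (first_index, last_index) for each run of consecutive SNPs
    whose score exceeds `threshold` and that has at least `min_outliers` SNPs.
    Defining a peak as a consecutive run is an assumption of this sketch."""
    peaks = []
    start = None
    for i, s in enumerate(scores):
        if s > threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_outliers:
                peaks.append((start, i - 1))
            start = None
    if start is not None and len(scores) - start >= min_outliers:
        peaks.append((start, len(scores) - 1))
    return peaks
```

For example, `outlier_peaks([0, 2.5, 3, 2.1, 0, 2.2, 0])` keeps the run of three outliers at indices 1 to 3 and discards the isolated outlier at index 5.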
Default parameters were used for each SV calling platform, except where a minimum of 25× sequence coverage across all platforms was used to call an SV and a minimum of five reads to form a single-end cluster (Table 17).
As gaps in highly repetitive regions of the reference genome represent the primary source of false positives in SV discovery (89,90), SV calls from all platforms were filtered with a custom script that removed all variant calls with breakpoints that fell inside gaps, microsatellites, and tandem repeats in the reference genome annotated by the UCSC Table Browser (91). The filtered sets of SV output by each program were merged into a final table and then clustered into a single event if both breakpoints fell within 200 base pairs of each other (92).
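The breakpoint-clustering rule can be illustrated with a small Python sketch. It is not the custom script referenced above: it uses a simple greedy strategy in which each call is compared against the first call of every existing cluster, treats "within 200 base pairs" as an inclusive bound, and ignores SV type and calling platform, all of which are assumptions made for brevity. The coordinates in the example are invented.

```python
def cluster_calls(calls, max_dist=200):
    """Group (start, end) SV calls into events.

    A call joins an existing cluster when BOTH of its breakpoints lie within
    `max_dist` bp of the breakpoints of that cluster's first call; otherwise
    it starts a new cluster.
    """
    clusters = []
    for start, end in sorted(calls):
        for cluster in clusters:
            ref_start, ref_end = cluster[0]
            if (abs(start - ref_start) <= max_dist
                    and abs(end - ref_end) <= max_dist):
                cluster.append((start, end))
                break
        else:
            # No existing cluster matched on both breakpoints.
            clusters.append([(start, end)])
    return clusters

# Hypothetical calls from different platforms (NOT study data).
example_calls = [(1000, 5000), (1100, 5150), (1100, 9000), (20000, 20500)]
events = cluster_calls(example_calls)
```

Here the first two calls merge into one event, while the third call shares a nearby start but a distant end and therefore remains a separate event.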
The univariate linear mixed model implemented in the program GEMMA (93) was used to test for associations between SVs and each of the three behavioral indices. GEMMA's univariate module fits a set of genotypes and corresponding phenotypes to a univariate linear mixed model that accounts for fixed effects, population stratification, and sample structure. For each variant, the univariate model tests the alternative hypothesis H1: β≠0 against the null hypothesis H0: β=0, using the Wald, likelihood ratio, and score test statistics, where β is the effect size of each variant on the phenotype of interest. Population stratification is accounted for using either a centered or standardized relatedness matrix as a random effect; the GEMMA authors recommend a centered matrix for non-human organisms. Three univariate models were thus implemented: the first estimating associations between SVs and attentional bias to social stimuli (ABS model), the second between SVs and hyper-sociability (HYP model), and the third between SVs and social interest in strangers (SIS model). For each univariate model, the centered relatedness matrix was estimated from SNP genotypes in the target region by GEMMA, and incorporated to account for relatedness and population structure among the samples. SNP genotypes were used in calculating the relatedness matrix in place of SV genotypes, as there was more than an order of magnitude more SNP genotypes than SV genotypes (4,844 vs. 89) on which to base the estimation. Negative values in the relatedness matrix, indicating that there was less relatedness between a given pair of individuals than would be expected between two randomly chosen individuals, were set to 0 in the resulting matrix (94,95). Sex and age were used as covariates. Only SVs with minor allele frequency (MAF)>0.025 were tested (96).
The Bonferroni correction for multiple comparisons was used in conjunction with the simpleM method for accounting for linkage disequilibrium among variants (97) to establish significance thresholds. With simpleM (http://simplem.sourceforge.net/), the effective number of independent tests was estimated as Meff=21, corresponding to a significance threshold of p=2.38×10−3 (Bonferroni cutoff of α=0.05 for 21 independently tested SVs). The likelihood ratio test was used to determine p-values. Because the ABS phenotype was calculated as a proportion, the arcsin transformation was applied before all analyses; all other phenotypes were not transformed.
GEMMA's multivariate linear mixed models estimate the association between a given variant and all phenotypes of interest simultaneously, accounting for the correlation between the phenotypes and generally exhibiting greater statistical power than univariate linear mixed models. Specifically, GEMMA's multivariate module fits a set of genotypes and corresponding phenotypes to a multivariate linear mixed model that accounts for fixed effects, population stratification and sample structure. For each variant, GEMMA tests the alternative hypothesis H1: β≠0 against the null hypothesis H0: β=0, using the Wald, likelihood ratio, and score test statistics, where β is the effect size of each variant for all phenotypes. In addition to the univariate models implemented for each phenotype individually, GEMMA's multivariate linear mixed model was used to estimate associations between SVs and several behavioral phenotypes simultaneously. Two multivariate models were implemented with the same model parameters and data transformation used in the univariate models: one estimating associations between SVs and the indices of human-directed sociability (Behavioral Index model) and the other estimating associations between SVs and the first three PCs of social behavior (PC model).
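As a rough illustration of the relatedness-matrix step, the following Python sketch computes a centered relatedness matrix and then sets negative entries to zero. It does not call GEMMA; it assumes the commonly described centered form, in which each SNP is mean-centered across individuals and the matrix is the average over SNPs of the cross-products, and it assumes complete genotype data. The toy genotypes are invented.

```python
def centered_relatedness(genotypes):
    """genotypes[i][j] is the allele count (0, 1, or 2) of individual i at SNP j.

    Returns an n x n matrix K with K[i][k] = (1/p) * sum_j c[i][j] * c[k][j],
    where c is the genotype matrix with each SNP column mean-centered.
    """
    n = len(genotypes)
    p = len(genotypes[0])
    snp_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    centered = [[row[j] - snp_means[j] for j in range(p)] for row in genotypes]
    return [[sum(a * b for a, b in zip(centered[i], centered[k])) / p
             for k in range(n)]
            for i in range(n)]

def clip_negative(matrix):
    """Set negative relatedness estimates to zero, as described in the text."""
    return [[max(0.0, value) for value in row] for row in matrix]

# Three hypothetical individuals typed at two SNPs (NOT study data).
toy_genotypes = [[0, 0], [2, 2], [1, 1]]
K = centered_relatedness(toy_genotypes)
K_clipped = clip_negative(K)
```

In this toy case the first two individuals carry opposite homozygous genotypes, so their raw relatedness estimate is −1 and becomes 0 after clipping.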
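The threshold arithmetic above can be reproduced with a short Python sketch. The Bonferroni calculation follows directly from the stated values; the arcsine function shown is the arcsine square-root form that is commonly applied to proportions, which is an assumption here because the exact form of the transformation is not specified in the text. The two p-values are taken from results reported elsewhere in this document.

```python
import math

ALPHA = 0.05
M_EFF = 21  # effective number of independent tests estimated with simpleM

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests

def arcsine_sqrt(proportion: float) -> float:
    """Arcsine square-root transform for a proportion in [0, 1] (assumed form)."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    return math.asin(math.sqrt(proportion))

threshold = bonferroni_threshold(ALPHA, M_EFF)

# Reported p-values: Cfa6.66 with ABS, and the suggestive Cfa6.6 with HYP.
reported = {"Cfa6.66_ABS": 1.38e-4, "Cfa6.6_HYP": 5.75e-3}
passes = {name: p < threshold for name, p in reported.items()}
```

Under these inputs the threshold is 0.05/21, approximately 2.38×10−3; the Cfa6.66 association with ABS passes it, while the Cfa6.6 association with HYP does not, consistent with its description as suggestive.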
To investigate the possibility that SVs are associated with species membership (dog versus wolf), an association scan of each SV locus with species membership was conducted with PLINK (98) (Table 11). Variants strongly associated with social behavior, but not species membership, are particularly robust candidates for mediators of social behavior.
An attempt was made to design primers flanking all SVs significantly associated with human-directed social behavior (Table 9) as well as two other SVs that were suggestive of an association but did not pass the significance threshold (univariate model: HYP and Cfa6.6, β±se=−138.8±33.62, p=5.75×10−3; ABS and Cfa6.83, β±se=−0.0640.09, p=6.90×10−3). Primers were designed based on the dog reference genome (CanFam3.1) with Primer3 (99) (Table 19).
Primers that amplified Cfa6.3 and Cfa6.72 could not be designed, and thus high-confidence codominant genotypes could only be obtained for Cfa6.6, Cfa6.7, Cfa6.66, and Cfa6.83. Cfa6.3 is ~40 bp downstream of a 300 bp gap in the reference genome. It is possible that this gap caused a false positive during the in silico annotation of this locus, as any sequencing into the gap would not map to the reference and could instead be interpreted as an insertion by SV annotation algorithms.
For the 24 dogs and wolves in the targeted resequencing study, along with a broader sampling of wild canids and dog breeds, each SV locus was PCR amplified and genotypes were called based on banding patterns in agarose gel electrophoresis.
PCR amplification and agarose gel electrophoresis were used to genotype four SVs in a panel of wild canids (n, gray wolves: Europe=12, India/Iran=7, China=3, Middle East=14, North America=15; coyotes n=13), the 16 domestic dogs from the initial sequencing efforts, and 201 domestic dogs from 13 AKC registered breeds (n, dogs: Alaskan Malamute=13, Bernese Mountain dog=20, Border Collie=20, Boxer=13, Basenji=7, Cairn Terrier=18, Golden Retriever=16, Great Pyrenees=17, Jack Russell Terrier=17, Miniature Poodle=10, Miniature Schnauzer=16, Pug=19, Saluki=15). Seventeen dogs from semi-domestic populations were also genotyped, representing New Guinea Singing dogs (NGSD, n=3), Pariah dogs from Saudi Arabia (n=4), and village dogs from two locations (Africa, n=5; Puerto Rico, n=5). Though an ideal design would include a large sampling of individuals from an experimental dog-wolf cross (e.g., F1 hybrids and backcrosses), this is not possible to construct in the United States as it would require generating an animal colony with years of selected breeding. An alternative method would be to explore genome editing with CRISPR/Cas9, which has only recently been shown to work in canines (100).
Breeds from across multiple breed-type clades were selected, representing different ancestries and behavioral functions. Each breed was phenotyped according to AKC behavioral stereotypes (41) into a category of either seeking or avoiding attention (Seeks attention: Bernese Mountain dog, Border collie, Boxer, Golden retriever, Jack Russell terrier, Miniature poodle, Pug; Avoids attention: Alaskan malamute, Basenji, Cairn terrier, Great Pyrenees, Miniature schnauzer, Saluki, and all semi-domestic dogs). The breeds that were classified as “seeks attention” were those that typically attempted to engage with humans, familiar or unfamiliar (41). It was not required that these breeds be gregarious or hyper-social, in that they actively seek any human attention; rather, that they show preference for working with humans, spending time, receiving affection, or offering behaviors to human counterparts. Conversely, the breeds that “avoid attention” are those that would classically be categorized as “aloof” or “independent”. They were either bred to exist on the periphery of human life, or tend to opt for individual pursuits.
Genome-wide SNP genotypes were collected using the Affymetrix Axiom K9HDSNPA (643,641 loci) and Axiom K9HDSNPB (625,577 loci) arrays with an average concentration of 26.5 ng/uL for 11 of the 24 individuals with behavioral phenotypes (n: dogs=5, wolves=6). Samples with a dish QC value≥0.82 and call rate≥97% were retained. SNP genotype quality control and processing identified that 794,665 SNPs, 56.3% of K9HDSNPA (250,545 loci) and 87% of K9HDSNPB (544,120 loci), passed filtering metrics. Affymetrix recommended a subset of 544,120 loci (referred to as 544K SNPs) to be included for all downstream analyses. PLINK was used to obtain a pruned set of 25,510 uncorrelated and unlinked SNPs with the argument --indep-pairwise 50 5 0.2, and a PCA was then conducted with the program flashPCA (101).
All subjects were volunteered by their owners/caretakers and remained in their care throughout the study. Experimental procedures were evaluated and approved by Oregon State University IACUC, protocol #4444. Laboratory methods were conducted under the approved IACUC protocol #2008A-14 of Princeton University. Institutional IACUC guidelines were followed with animal subjects.
- 1. Frank, H., Evolution of canine information processing under conditions of natural and artificial selection. Zeitschrift für Tierpsychologie 53, 389-399 (1980).
- 2. Miklósi, Á., Topal, J., Csányi, V., Comparative social cognition: what can dogs teach us? Anim. Behav. 67, 995-1004 (2004).
- 3. Hare, B., Tomasello, M., Human-like social skills in dogs? Trends. Cogn. Sci. 9, 439-444 (2005).
- 4. Udell, M. A., Dorey, N. R., Wynne, C. D., What did domestication do to dogs? A new account of dogs' sensitivity to human actions. Biol. Rev. 85, 327-345 (2010).
- 5. Trut, L., Oskina, I., Kharlamova, A., Animal evolution during domestication: the domesticated fox as a model. Bioessays 31(3), 349-360 (2009).
- 6. Nagasawa, M., Mitsui, S., En, S., Ohtani, N., Ohta, M., Sakuma, Y., Onaka, T., Mogi, K., Kikusui, T., Oxytocin-gaze positive loop and the coevolution of human-dog bonds. Science 348, 333-336 (2015).
- 7. Bentosela, M., Wynne, C. D., D'Orazio, M., Elgier, A., Udell, M. A. R., Sociability and gazing toward humans in dogs and wolves: Simple behaviors with broad implications. J. Exp. Anal. Behav. 105, 68-75 (2016).
- 8. Udell, M. A., When dogs look back: inhibition of independent problem-solving behaviour in domestic dogs (Canis lupus familiaris) compared with wolves (Canis lupus). Biol. Letters 11, 20150489 (2015).
- 9. Jones, P., Chase, K., Martin, A., Davern, P., Ostrander, E. A., Lark, K. G., Single-nucleotide-polymorphism-based association mapping of dog stereotypes. Genetics 179(2), 1033-1044 (2008).
- 10. Parker, H. G., Kim, L. V., Sutter, N. B., Carlson, S., Lorentzen, T. D., Malek, T. B., Johnson, G. S., DeFrance, H. B., Ostrander, E. A., Kruglyak, L., Genetic structure of the purebred domestic dog. Science 304, 1160-1164 (2004).
- 11. Serpell, J. A., Hsu, Y., Effects of breed, sex, and neuter status on trainability in dogs. Anthrozoös 18, 196-207 (2005).
- 12. Svartberg, K., Breed-typical behavior in dogs—historical remnants or recent constructs? Appl. Anim. Behav. Sci. 96, 293-313 (2006).
- 13. Duffy, D. L., Hsu, Y., Serpell, J. A., Breed differences in canine aggression. Appl. Anim. Behav. Sci. 114, 441-460 (2008).
- 14. Ley, J. M., Bennett, P. M., Coleman, G. J., A refinement and validation of the Monash Canine Personality Questionnaire (MCPQ). Appl. Anim. Behav. Sci. 116, 220-227 (2009).
- 15. Turcsán, B., Kubinyi, E., Miklósi, A., Trainability and boldness traits differ between dog breed clusters based on conventional breed categories and genetic relatedness. Appl. Anim. Behav. Sci. 132, 61-70 (2011).
- 16. Vaysse, A., Ratnakumar, A., Derrien, T., Axelsson, E., Rosengren Pielberg, G., Sigurdsson, S., Fall, T., Seppälä, E. H., Hansen, M. S. T., Lawley, C. T., Karlsson, E. K., The LUPA Consortium, Bannasch, D., Vilà, C., Lohi, H., Galibert, F., Fredholm, M., Häggström, J., Hedhammar, A., André, C., Lindblad-Toh, K., Hitte, C., Webster, M. T., Identification of genomic regions associated with phenotypic variation between dog breeds using selection mapping. PLoS Genet. 7(10), e1002316 (2011).
- 17. Serpell, J. A., Duffy, D. L., Dog breeds and their behavior. A. Horowitz (ed.), Domestic Dog Cognition and Behavior, Springer p 31-57 (2014).
- 18. Persson, M. E., Roth, L. S. V., Johnson, M., Wright, D., Jensen, P., Human-directed social behavior in dogs shows significant heritability. Genes Brain Behav. 14, 337-344 (2015).
- 19. vonHoldt, B. M., Pollinger, J. P., Lohmueller, K. E., Han, E., Parker, H. G., Quignon, P., Degenhardt, J. D., Boyko, A. R., Earl, D. A., Auton, A., Reynolds, A., Bryc, K., Brisbin, A., Knowles, J. C., Mosher, D. S., Spady, T. C., Elkahloun, A., Geffen, E., Pilot, M., Jedrzejewski, W., Greco, C., Randi, E., Bannasch, D., Wilton, A., Shearman, J., Musiani, M., Cargill, M., Jones, P. G., Qian, Z., Huang, W., Ding, Z.-L., Zhang, Y. P., Bustamante, C. D., Ostrander, E. A., Novembre, J., Wayne, R. K., Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication. Nature 464, 898-902 (2010).
- 20. Schubert, C., The genomic basis of the Williams-Beuren syndrome. Cell. Mol. Life Sci. 66, 1178-1197 (2009).
- 21. Meyer-Lindenberg, A., Mervis, C. B., Berman, K. F., Neural mechanisms in Williams syndrome: a unique window to genetic influences on cognition and behaviour. Nat. Rev. Neurosci. 7, 380-393 (2006).
- 22. Jones, W., Bellugi, U., Lai, Z., Chiles, M., Reilly, J., Lincoln, A., Adolphs, R., II. Hypersociability in Williams syndrome. J. Cognitive Neurosci. 12, 30-46 (2000).
- 23. Ewart, A. K., Morris, C. A., Atkinson, D., Jin, W., Sternes, K., Spallone, P., Stock, A. D., Leppert, M., Hemizygosity at the elastin locus in a developmental disorder, Williams syndrome. Nat. Genet. 5, 11-16 (1993).
- 24. Wan, M., Hejjas, K., Ronai, Z., Elek, Z., Sasvari-Szekely, M., Champagne, F. A., Miklósi, Á., Kubinyi, E., DRD4 and TH gene polymorphisms are associated with activity, impulsivity and inattention in Siberian Husky dogs. Anim. Genet. 44, 717-727 (2013).
- 25. Kis, A., Bence, M., Lakatos, G., Pergel, E., Turcsan, B., Pluijmakers, J., Vas, J., Elek, Z., Bruder, I., Foldi, L., Sasvari-Szekely, M., Miklósi, A., Ronai, Z., Kubinyi, E., Oxytocin receptor gene polymorphisms are associated with human directed social behavior in dogs (Canis familiaris). PLoS One 9(1): e83993. doi:10.1371/journal.pone.0083993 (2014).
- 26. Jakovcevic, A., Mustaca, A., Bentosela, M., Do more sociable dogs gaze longer to the human face than less sociable ones? Behav. Process. 90, 217-222 (2012).
- 27. Bentosela, M., Wynne, C. D., D'Orazio, M., Elgier, A., Udell, M. A., Sociability and gazing toward humans in dogs and wolves: Simple behaviors with broad implications. J. Exp. Anal. Behav. 105, 68-75 (2016).
- 28. Brubaker, L., Dasgupta, S., Bhattacharjee, D., Bhadra, A., Udell, M. A. R., Differences in problem-solving between canid populations: Do domestication and lifetime experience affect persistence? Anim. Cogn. https://doi.org/10.1007/s10071-017-1093-7 (2017).
- 29. Walsh, T., McClellan, J. M., McCarthy, S. E., Addington, A. M., Pierce, S. B., Cooper, G. M., Nord, A. S., Kusenda, M., Malhotra, D., Bhandari, A., Stray, S. M., Rippey, C. F., Roccanova, P., Makarov, V., Lakshmi, B., Findling, R. L., Sikich, L., Stromberg, T., Merriman, B., Gogtay, N., Butler, P., Eckstrand, K., Noory, L., Gochman, P., Long, R., Chen, Z., Davis, S., Baker, C., Eichler, E. E., Meltzer, P. S., Nelson, S. F., Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science 320, 539-543 (2008).
- 30. Cuscó, I., Corominas, R., Bayés, M., Flores, R., Rivera-Brugués, N., Campuzano, V., Perez-Jurado, L. A., Copy number variation at the 7q11.23 segmental duplications is a susceptibility factor for the Williams-Beuren syndrome deletion. Genome Res. 18, 683-694 (2008).
- 31. Wong, K., Keane, T. M., Stalker, J., Adams, D. J., Enhanced structural variant and breakpoint detection using SVMerge by integration of multiple detection methods and local assembly. Genome Biol. 11, R128 (2010).
- 32. Hart, S. N., Sarangi, V., Moore, R., Baheti, S., Bhavsar, J. D., Couch, F. J., Kocher, J.-P. A., SoftSearch: integration of multiple sequence features to identify breakpoints of structural variations. PLoS One 8, e83356 (2013).
- 33. Qi, J., Zhao, F., inGAP-sv: a novel scheme to identify and visualize structural variation from paired end mapping data. Nucleic Acids Res. 39, W567-W575 (2011).
- 34. Korenberg, J. R., Chen, X.-N., Hirota, H., VI. Genome structure and cognitive map of Williams Syndrome. J. Cognitive Neurosci. 12(1), 89-107 (2000).
- 35. Young, E. J., Lipina, T., Tam, E., Mandel, A., Clapcote, S. J., Bechard, A. R., Chambers, J., Mount, H. T. J., Fletcher, P. J., Roder, J. C., Osborne, L. R., Reduced fear and aggression and altered serotonin metabolism in Gtf2ird1-tagged mice. Genes Brain Behav. 7, 224-234 (2008).
- 36. Li, H. H., Roy, M., Kuscuoglu, U., Spencer, C. M., Halm, B., Harrison, K. C., Bayle, J. H., Splendore, A., Ding, F., Meltzer, L. A., Wright, E., Paylor, R., Deisseroth, K., Francke, U., Induced chromosome deletions cause hypersociability and other features of Williams-Beuren syndrome in mice. EMBO Mol. Med. 1, 50-65 (2009).
- 37. Doyle, T. F., Bellugi, U., Korenberg, J. R., Graham, J., “Everybody in the world is my friend” hypersociability in young children with Williams syndrome. Am. J. Med. Genet. A 124, 263-273 (2004).
- 38. Edelmann, L., Prosnitz, A., Pardo, S., Bhatt, J., Cohen, N., Lauriat, T., Ouchanov, L., Gonzalez, P. J., Manghi, E. R., Bondy, P., Esquivel, M., Monge, S., Delgado, M. F., Splendore, A., Francke, U., Burton, B. K., McInnes, L. A., An atypical deletion of the Williams-Beuren syndrome interval implicates genes associated with defective visuospatial processing and autism. J. Med. Genet. 44, 136-143 (2007).
- 39. McLaren, W., Pritchard, B., Rios, D., Chen, Y., Flicek, P., Cunningham, F., Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26, 2069-2070 (2010).
- 40. Hubbard, T., Barker, D., Birney, E., Cameron, G., Chen, Y., Clark, L., Cox, T., Cuff, J., Curwen, V., Down, T., Durbin, R., Eyras, E., Gilbert, J., Hammond, M., Huminiecki, L., Kasprzyk, A., Lehvaslaiho, H., Lijnzaad, P., Melsopp, C., Mongin, E., Pettett, R., Pocock, M., Potter, S., Rust, A., Schmidt, E., Searle, S., Slater, G., Smith, J., Spooner, W., Stabenau, A., Stalker, J., Stupka, E., Ureta-Vidal, A., Vastrik, I., Clamp, M., The Ensembl genome database project. Nucleic Acids Res. 30, 38-41 (2002).
- 41. American Kennel Club, The New Complete Dog Book: Official Breed Standards and All-New Profiles for 200 Breeds (21st Edition). Irvine, Calif.: Lumina Media (2014).
- 42. Bayes, M., Magano, L. F., Rivera, N., Flores, R., Perez Jurado, L. A., Mutational mechanisms of Williams-Beuren syndrome deletions. Am. J. Hum. Genet. 73, 131-151 (2003).
- 43. Reymond, A., Henrichsen, C. N., Harewood, L., Merla, G., Side effects of genome structural changes. Curr. Opin. Genet. Dev. 17, 381-386 (2007).
- 44. Merla, G., Micale, L., Fusco, C., Loviglio, M. N., Molecular Genetics of Williams-Beuren Syndrome. In: eLS. John Wiley & Sons, Ltd: Chichester (2012).
- 45. Bayarsaihan, D., Ruddle, F. H., Isolation and characterization of BEN, a member of the TFII-I family of DNA-binding proteins containing distinct helix-loop-helix domains. Proc. Natl. Acad. Sci. USA 97(13), 7342-7347 (2000).
- 46. Tipney, H. J., Hinsley, T. A., Brass, A., Metcalfe, K., Donnai, D., Tassabehji, M., Isolation and characterization of GTF2IRD2, a novel fusion gene and member of the TFII-I family of transcription factors, deleted in Williams-Beuren syndrome. Eur. J. Hum. Genet. 12, 551-560 (2004).
- 47. Tassabehji, M., Hammond, P., Karmiloff-Smith, A., Thompson, P., Thorgeirsson, S. S., Durkin, M. E., Popescu, N.C., Hutton, T., Metcalfe, K., Rucka, A., Stewart, H., Read, A. P., Maconochie, M., Donnai, D., GTF2IRD1 in craniofacial development of humans and mice. Science 310(5751), 1184-1187 (2005).
- 48. Chimge, N. O., Makeyev, A. V., Ruddle, F. H., Bayarsaihan, D., Identification of the TFII-I family target genes in the vertebrate genome. Proc. Natl. Acad. Sci. USA 105(26), 9006-9010 (2008).
- 49. Porter, M. A., Dobson-Stone, C., Kwok, J. B., Schofield, P. R., Beckett, W., Tassabehji, M., A role for transcription factor GTF2IRD2 in executive function in Williams-Beuren Syndrome. PLoS One 7(10), e47457 (2012).
- 50. Sakurai, T., Dorr, N. P., Takahashi, N., McInnes, L. A., Elder, G. A., Buxbaum, J. D., Haploinsufficiency of Gtf2i, a gene deleted in Williams Syndrome, leads to increases in social interactions. Autism Res. 4, 28-39 (2011).
- 51. Procyshyn, T. L., Spence, J., Read, S., Watson, N. V., Crespi, B. J., The Williams syndrome prosociality gene GTF2I mediates oxytocin reactivity and social anxiety in a healthy population. Biol. Letters 13(4), http://dx.doi.org/10.1098/rsbl.2017.0051 (2017).
- 52. Merla, G., Howald, C., Henrichsen, C. N., Lyle, R., Wyss, C., Zabot, M. T., Antonarakis, S. E., Reymond, A., Submicroscopic deletion in patients with Williams-Beuren syndrome influences expression levels of the nonhemizygous flanking genes. Am. J. Hum. Genet. 79(2), 332-341 (2006).
- 53. Li, H. H., Roy, M., Kuscuoglu, U., Spencer, C. M., Halm, B., Harrison, K. C., Bayle, J. H., Splendore, A., Ding, F., Meltzer, L. A., Wright, E., Paylor, R., Deisseroth, K., Francke, U., Induced chromosome deletions cause hypersociability and other features of Williams-Beuren syndrome in mice. EMBO Mol. Med. 1(1), 50-65 (2009).
- 54. Lau, K. S., Khan, S., Dennis, J. W., Genome-scale identification of UDP-GlcNAc-dependent pathways. Proteomics 8(16), 3294-3302 (2008).
- 55. Axelsson, E., Ratnakumar, A., Arendt, M.-L., Maqbool, K., Webster, M. T., Perloski, M., Liberg, O., Arnemo, J. M., Hedhammar, A., Lindblad-Toh, K., The genomic signature of dog domestication reveals adaptation to a starch-rich diet. Nature 495(7441), 360-364 (2013).
- 56. Cowley, M., Oakey, R. J., Transposable elements re-wire and fine-tune the transcriptome. PLOS Genet. 9(1), e1003234 (2013).
- 57. Wang, W., Kirkness, E. F., Short interspersed elements (SINEs) are a major source of canine genomic diversity. Genome Res. 15, 1798-1808 (2005).
- 58. Janowitz Koch, I., Clark, M. M., Thompson, M. J., Deere-Machemer, K. A., Wang, J., Duarte, L., Gnanadesikan, G. E., McCoy, E. L., Rubbi, L., Stahler, D. R., Pellegrini, M., Ostrander, E. A., Wayne, R. K., Sinsheimer, J. S., vonHoldt, B. M., The concerted impact of domestication and transposon insertions on methylation patterns between dogs and grey wolves. Mol. Ecol. 25(8), 1838-1855 (2016).
- 59. Lin, L., Faraco, J., Li, R., Kadotani, H., Rogers, W., Lin, X., Qiu, X., de Jong, P. J., Nishino, S., Mignot, E., The sleep disorder canine narcolepsy is caused by a mutation in the hypocretin (orexin) receptor 2 gene. Cell 98, 365-376 (1999).
- 60. Pele, M., Tiret, L., Kessler, J. L., Blot, S., Panthier, J. J., SINE exonic insertion in the PTPLA gene leads to multiple splicing defects and segregates with the autosomal recessive centronuclear myopathy in dogs. Hum. Mol. Genet. 14, 1417-1427 (2005).
- 61. Clark, L. A., Wahl, J. M., Rees, C. A., Murphy, K. E., Retrotransposon insertion in SILV is responsible for merle patterning of the domestic dog. Proc. Natl. Acad. Sci. USA 103, 1376-1381 (2006).
- 62. Sutter, N. B., Bustamante, C. D., Chase, K., Gray, M. M., Zhao, K., Zhu, L., Padhukasahasram, B., Karlins, E., Davis, S., Jones, P. G., Quignon, P., Johnson, G. S., Parker, H. G., Fretwell, N., Mosher, D. S., Lawler, D. F., Satyaraj, E., Nordborg, M., Lark, K. G., Wayne, R. K., Ostrander, E. A., A single IGF1 allele is a major determinant of small size in dogs. Science 316(5821), 112-115 (2007).
- 63. Parker, H. G., vonHoldt, B. M., Quignon, P., Margulies, E. H., Shao, S., Mosher, D. S., Spady, T. C., Elkahloun, A., Cargill, M., Jones, P. G., Maslen, C. L., Acland, G. M., Sutter, N. B., Kuroki, K., Bustamante, C. D., Wayne, R. K., Ostrander, E. A., An expressed fgf4 retrogene is associated with breed-defining chondrodysplasia in domestic dogs. Science 325(5943), 995-998 (2009).
- 64. Gray, M. M., Sutter, N. B., Ostrander, E. A., Wayne, R. K., The IGF1 small dog haplotype is derived from Middle Eastern grey wolves. BMC Biology 8, 16 (2010).
- 65. Karlsson, E. K., Lindblad-Toh, K., Leader of the pack: gene mapping in dogs and other model organisms. Nat. Rev. Genet. 9, 713-725 (2008).
- 66. Boyko, A. R., The domestic dog: man's best friend in the genomic era. Genome Biology 12, 216 (2011).
- 67. Anderson, T. M., vonHoldt, B. M., Candille, S. I., Musiani, M., Greco, C., Stahler, D. R., Smith, D. W., Padhukasahasram, B., Randi, E., Leonard, J. A., Bustamante, C. D., Ostrander, E. A., Tang, H., Wayne, R. K., Barsh, G. S., Molecular and evolutionary history of melanism in North American gray wolves. Science 323(5919), 1339-1343 (2009).
- 68. Frank, H., Frank, M. G., On the effects of domestication on canine social development and behavior. Appl. Anim. Ethol. 8, 507-525 (1982).
- 69. Udell, M. A. R., Dorey, N. R., Wynne, C. D. L., The performance of stray dogs (Canis familiaris) living in a shelter on human-guided object-choice tasks. Anim. Behav. 79, 717-725 (2010).
- 70. Klinghammer, E., Goodman, P., Socialization and management of wolves in captivity. In H. Frank (Ed.), Man and Wolf: Advances, Issues, and Problems in Captive Wolf Research. Springer (1987).
- 71. Udell, M. A., Dorey, N. R., Wynne, C. D., Wolves outperform dogs in following human social cues. Anim. Behav. 76, 1767-1773 (2008).
- 72. McHugh, M. L., Interrater reliability: the kappa statistic. Biochem. Medica 22, 276-282 (2012).
- 73. Udell, M. A. R., Dorey, N. R., Wynne, C. D. L., Wolves outperform dogs in following human social cues. Anim. Behav. 76, 1767-1773 (2008).
- 74. Raiche, G., nFactors: An R package for parallel analysis and non graphical solutions to the Cattell scree test. R package version 2 (2010).
- 75. Aschard, H., Vilhjálmsson, B. J., Greliche, N., Morange, P. E., Tregouet, D. A., Maximizing the power of principal-component analysis of correlated phenotypes in genome-wide association studies. Am. J. Hum. Genet. 94, 662-676 (2014).
- 76. Cunningham, F., Amode, M. R., Barrell, D., Beal, K., Billis, K., Brent, S., Carvalho-Silva, D., Clapham, P., Coates, G., Fitzgerald, S., Gil, L., Girón, C. G., Gordon, L., Hourlier, T., Hunt, S. E., Janacek, S. H., Johnson, N., Juettemann, T., Kähäri, A. K., Keenan, S., Martin, F. J., Maurel, T., McLaren, W., Murphy, D. N., Nag, R., Overduin, B., Parker, A., Patricio, M., Perry, E., Pignatelli, M., Riat, H. S., Sheppard, D., Taylor, K., Thormann, A., Vullo, A., Wilder, S. P., Zadissa, A., Aken, B. L., Birney, E., Harrow, J., Kinsella, R., Muffato, M., Ruffier, M., Searle, S. M. J., Spudich, G., Trevanion, S. J., Yates, A., Zerbino, D. R., Flicek, P., Ensembl 2015. Nucleic Acids Res. 43, D662-D669 (2015).
- 77. Martin, M., Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet Journal 17, 10-12 (2011).
- 78. Li, H., Durbin, R., Fast and accurate short read alignment with Burrows-Wheeler Transform. Bioinformatics 25, 1754-1760 (2009).
- 79. Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., 1000 Genome Project Data Processing Subgroup, The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078-2079 (2009).
- 80. Korneliussen, T. S., Albrechtsen, A., Nielsen, R., ANGSD: analysis of next generation sequencing data. BMC Bioinformatics 15, 356 (2014).
- 81. Delaneau, O., Marchini, J., Zagury, J.-F., A linear complexity phasing method for thousands of genomes. Nat. Methods 9, 179-181 (2012).
- 82. Sabeti, P. C., Varilly, P., Fry, B., Lohmueller, J., Hostetter, E., Cotsapas, C., Xie, X., Byrne, E. H., McCarroll, S. A., Gaudet, R., Schaffner, S. F., Lander, E. S., The International HapMap Consortium, Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913-918 (2007).
- 83. Chen, K., Wallis, J. W., McLellan, M. D., Larson, D. E., Kalicki, J., Pohl, C. S., McGrath, S. D., Wendl, M. C., Zhang, Q., Locke, D. P., Shi, X., Fulton, R. S., Ley, T. J., Wilson, R. K., Ding, L., Mardis, E. R., BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat. Methods 6, 677-681 (2009).
- 84. Ye, K., Schulz, M. H., Long, Q., Apweiler, R., Ning, Z., Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 25, 2865-2871 (2009).
- 85. Wong, K., Keane, T. M., Stalker, J., Adams, D. J., Enhanced structural variant and breakpoint detection using SVMerge by integration of multiple detection methods and local assembly. Genome Biol. 11, R128 (2010).
- 86. Hart, S. N., Sarangi, V., Moore, R., Baheti, S., Bhavsar, J. D., SoftSearch: integration of multiple sequence features to identify breakpoints of structural variations. PLoS One 8, e83356 (2013).
- 87. Tattini, L., D'Aurizio, R., Magi, A., Detection of genomic structural variants from next-generation sequencing data. Frontiers Bioeng. Biotechnol. 3 (2015).
- 88. Qi, J., Zhao, F., inGAP-sv: a novel scheme to identify and visualize structural variation from paired end mapping data. Nucleic Acids Res. 39, W567-W575 (2011).
- 89. Hollox, E. J., “The challenges of studying complex and dynamic regions of the human genome” in Genomic Structural Variants (Springer, New York), pp. 187-207 (2012).
- 90. Quinlan, A. R., Hall, I. M., Characterizing complex structural variation in germline and somatic genomes. Trends Genet. 28, 43-53 (2012).
- 91. Kent, W. J., Sugnet, C. W., Furey, T. S., Roskin, K. M., Pringle, T. H., Zahler, A. M., Haussler, D., The human genome browser at UCSC. Genome Res. 12, 996-1006 (2002).
- 92. Decker, B., Davis, B. W., Rimbault, M., Long, A. H., Karlins, E., Jagannathan, V., Reiman, R., Parker, H. G., Drögemüller, C., Corneveaux, J. J., Chapman, E. S., Trent, J. M., Leeb, T., Huentelman, M. J., Wayne, R. K., Karyadi, D. M., Ostrander, E. A., Comparison against 186 canid whole-genome sequences reveals survival strategies of an ancient clonally transmissible canine tumor. Genome Res. 25, 1646-1655 (2015).
- 93. Zhou, X., Stephens, M., Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821-824 (2012).
- 94. Stich, B., Möhring, J., Piepho, H.-P., Heckenberger, M., Buckler, E. S., Melchinger, A. E., Comparison of mixed-model approaches for association mapping. Genetics 178, 1745-1754 (2008).
- 95. Mandel, J. R., Nambeesan, S., Bowers, J. E., Marek, L. F., Ebert, D., Rieseberg, L. H., Knapp, S. J., Burke, J., Association mapping and the genomic consequences of selection in sunflower. PLoS Genet. 9, e1003378 (2013).
- 96. Tabangin, M. E., Woo, J. G., Martin, L. J., The effect of minor allele frequency on the likelihood of obtaining false positives. BMC Proceedings 3, S41 (2009).
- 97. Gao, X., Starmer, J., Martin, E. R., A multiple testing correction method for genetic association studies using correlated single nucleotide polymorphisms. Genet. Epidemiol. 32, 361-369 (2008).
- 98. Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M. A., Bender, D., Maller, J., Sklar, P., de Bakker, P. I., Daly, M. J., Sham, P. C., PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559-575 (2007).
- 99. Untergasser, A., Cutcutache, I., Koressaar, T., Ye, J., Faircloth, B. C., Remm, M., Rozen, S. G., Primer3—New capabilities and interfaces. Nucleic Acids Res. 40(15), e115 (2012).
- 100. Zou, Q., Wang, X., Liu, Y., Ouyang, Z., Long, H., Wei, S., Xin, J., Zhao, B., Lai, S., Shen, J., Ni, Q., Yang, H., Zhong, H., Li, L., Hu, M., Zhang, Q., Zhou, Z., He, J., Yan, Q., Fan, N., Zhao, Y., Liu, Z., Guo, L., Huang, J., Zhang, G., Ying, J., Lai, L., Gao, X., Generation of gene-target dogs using CRISPR/Cas9 system. J. Mol. Cell. Biol. 7(6), 580-583 (2015).
- 101. Abraham, G., Inouye, M., Fast principal component analysis of large-scale genome-wide data. PLoS One 9, e93766 (2014).
- 102. Zhang, B., Kirov, S. A., Snoddy, J. R., WebGestalt: an integrated system for exploring gene sets in various biological contexts. Nucleic Acids Res. 33 (Web Server Issue), W741-748 (2005).
- 103. Wang, J., Duncan, D., Shi, Z., Zhang, B., WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): update 2013. Nucleic Acids Res. 41 (Web Server Issue), W77-83 (2013).
- 104. Benjamini, Y., Hochberg, Y., Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Met. 57, 289-300 (1995).
- 105. Fusco, C., Micale, L., Augello, B., Pellico, T., Menghini, D., Alfieri, P., Digilio, M. C., Mandriani, B., Carella, M., Palumbo, O., Vicari, S., Merla, G., Smaller and larger deletions of the Williams Beuren syndrome region implicate genes involved in mild facial phenotype, epilepsy and autistic traits. Eur. J. Hum. Genet. 22, 64-70 (2014).