Adding leaves to the Lepidoptera tree: capturing hundreds of nuclear genes from old museum specimens
Museum collections around the world contain billions of specimens, including rare and extinct species. If their genetic information could be retrieved at a large scale, it would dramatically increase our knowledge of genetic and taxonomic diversity and support evolutionary, ecological and systematic studies. Here we present a target enrichment kit for Lepidoptera that targets 2953 loci in 1753 orthologous nuclear genes plus the barcoding region of cytochrome c oxidase 1, and demonstrate its utility for obtaining a large number of nuclear loci from dry, pinned museum material collected between 1892 and 2017. We sequenced enriched libraries of 37 museum specimens across the order Lepidoptera, many from higher taxa not yet included in high‐throughput molecular studies, showing that our kit can be used to generate comparable data across the order and provides resolution for both shallower and deeper nodes. The filtered datasets (172 taxa, 234 464 amino acid positions and corresponding nucleotides from 1835 CDS regions) were used to infer a phylogeny of Lepidoptera, which is largely congruent in topology with recent phylogenomic studies but adds some key taxa. We furthermore present our TEnriAn (Target Enrichment Analysis) workflow for processing and combining target enrichment, transcriptomic and genomic data.