Genetic Data - Sequence Analysis

MeSH ID: D017421

A multistage process that includes the determination of a sequence (protein, carbohydrate, etc.), its fragmentation and analysis, and the interpretation of the resulting sequence information.

Best practice for sharing this type of data:
Datasets are typically stored and shared as .bam, .vcf, or .fastq files. All important datasets should be made available, along with any additional textual information (e.g. relevant genetic maps), which can be shared as a .txt file. If the study involves human data, the ethical considerations around sharing need to be evaluated; see Subject Data Table (Tabular data).
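One possible way to make a shared set of files easier to verify is to ship a small plain-text manifest alongside them. The sketch below is only an illustration, not a repository requirement: the function names, the `MANIFEST.txt` file name, and the tab-separated column layout are assumptions made for this example.

```python
import hashlib
from pathlib import Path

# Suffixes named in the guidance above; adjust to match your own dataset.
SEQUENCE_SUFFIXES = {".bam", ".vcf", ".fastq", ".txt"}
MANIFEST_NAME = "MANIFEST.txt"  # hypothetical file name, chosen for this sketch


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory: Path) -> str:
    """List each data file in `directory` with its size and checksum."""
    rows = ["filename\tsize_bytes\tsha256"]
    for path in sorted(directory.iterdir()):
        if path.name == MANIFEST_NAME or not path.is_file():
            continue  # skip the manifest itself and any subdirectories
        if path.suffix.lower() in SEQUENCE_SUFFIXES:
            size = path.stat().st_size
            rows.append(f"{path.name}\t{size}\t{file_sha256(path)}")
    return "\n".join(rows) + "\n"
```

For example, `Path("dataset", MANIFEST_NAME).write_text(build_manifest(Path("dataset")))` would write the manifest next to the data files, where `dataset` is a placeholder directory name.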

Most suitable repositories:
DNA sequence analyses may be added to DDBJ, ENA, NCBI Gene, Reference Sequence Database, NCBI Genome, Genetic Testing Registry, GenBank, Genomic Expression Archive, Pfam, and UniProt KnowledgeBase.

Best practice for indicating re-use of existing data:
For public datasets, please provide a DOI or other stable identifier for the dataset itself *and* include a citation for the dataset in the reference list. Be sure to indicate exactly which data has been re-used, particularly when multiple versions of the dataset exist. In many cases, this is best achieved by sharing the code used to extract the part of the data that you analyzed. In some cases it may be best to share the exact dataset(s) you analyzed as well.
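As an illustration of sharing extraction code, the following minimal sketch selects the records of an uncompressed, tab-separated VCF that fall in one genomic region and prints a checksum of the extracted subset, so readers can confirm they reproduced the same data. The function names, the region, and the in-memory example records are all hypothetical; real pipelines usually rely on dedicated tools for compressed or indexed files.

```python
import hashlib
from typing import Iterable, Iterator


def extract_region(lines: Iterable[str], chrom: str,
                   start: int, end: int) -> Iterator[str]:
    """Yield VCF header lines plus records on `chrom` with start <= POS <= end."""
    for line in lines:
        if line.startswith("#"):
            yield line  # keep the header so the subset remains a readable VCF
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        if fields[0] == chrom and start <= int(fields[1]) <= end:
            yield line


def sha256_hex(lines: Iterable[str]) -> str:
    """Checksum of the extracted text, to report alongside the dataset citation."""
    digest = hashlib.sha256()
    for line in lines:
        digest.update(line.encode("utf-8"))
    return digest.hexdigest()


# Made-up records standing in for a downloaded public dataset.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n",
    "chr1\t5000\t.\tC\tT\t50\tPASS\t.\n",
    "chr2\t150\t.\tG\tA\t50\tPASS\t.\n",
]

subset = list(extract_region(EXAMPLE_VCF, "chr1", 1, 1000))
print("".join(subset), end="")
print("sha256:", sha256_hex(subset))
```

Publishing a script like this together with the dataset identifier and version records exactly which part of the data was re-used.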

For access-controlled data, authors should provide a link to instructions for obtaining access (e.g. the information page for ADNI, the Alzheimer's Disease Neuroimaging Initiative).

When re-using a private dataset from a previous study, please contact the data owners to discuss how the data can be made public.

data_type/genetic_data/sequence_analysis.txt · Last modified: 2022/07/08 05:33 by souad