List of file formats
=== Biology ===
Molecular biology and bioinformatics:
* AB1 – In [[DNA sequencing]], [[chromatogram]] files used by instruments from [[Applied Biosystems]]
* ACE – A [[sequence assembly]] format
* ASN.1 – [[Abstract Syntax Notation One]], an [[International Organization for Standardization]] (ISO) data representation format used to achieve interoperability between platforms. [[National Center for Biotechnology Information|NCBI]] uses ASN.1 for the storage and retrieval of data such as nucleotide and protein sequences, structures, genomes, and PubMed records.
* BAM – [[Binary Alignment/Map]] format (compressed SAM format)
* BCF – Binary compressed VCF format
* BED – The [[BED file format|browser extensible display format]] is used for describing [[gene]]s and other features of [[DNA]] sequences
* CAF – Common Assembly Format for [[sequence assembly]]
* [[CRAM (file format)|CRAM]] – Compressed file format for storing biological sequences aligned to a reference sequence
* DDBJ – The flatfile format used by the [[DNA Data Bank of Japan|DDBJ]] to represent database records for [[nucleotide sequence|nucleotide]] and [[peptide sequence]]s from [[DDBJ]] databases
* EMBL – The flatfile format used by the [[European Molecular Biology Laboratory|EMBL]] to represent database records for [[nucleotide sequence|nucleotide]] and [[peptide sequence]]s from [[EMBL]] databases
* FASTA – The [[FASTA format]], for sequence data. Sometimes also given as FNA or FAA (FASTA Nucleic Acid or FASTA Amino Acid).
* FASTQ – The [[FASTQ format]], for sequence data with quality. Sometimes also given as QUAL.
* GCPROJ – The [[Genome Compiler]] project format, used to design, share, and visualize genetic data
* GenBank – The flatfile format used by the [[National Center for Biotechnology Information|NCBI]] to represent database records for [[nucleotide sequence|nucleotide]] and [[peptide sequence]]s from the [[GenBank]] and [[RefSeq]] databases
* GFF – The [[General feature format]] is used to describe [[gene]]s and other features of [[DNA]], [[RNA]], and [[protein]] sequences
* GTF – The [[Gene transfer format]] is used to hold information about [[gene]] structure
* MAF – The [[Multiple Alignment Format]] stores multiple alignments for whole-genome to whole-genome comparisons [https://biopython.org/wiki/Multiple_Alignment_Format]
* NCBI – Structured [[ASN.1]] format used at [[National Center for Biotechnology Information]] for DNA and protein data
* NEXUS – The [[Nexus file]] encodes mixed information about genetic sequence data in a block-structured format
* NeXML – XML format for [[phylogenetic tree]]s
* NWK – The [[Newick format|Newick tree format]] represents graph-theoretical trees with edge lengths using parentheses and commas, and is used to hold [[phylogenetic tree]]s
* PDB – Structures of biomolecules deposited in [[Protein Data Bank]], also used to exchange protein and nucleic acid structures
* PHD – Phred output, from the base-calling software [[Phred (software)|Phred]]
* PLN – Protein Line Notation used in [http://www.biochemfusion.com/products/ Proteax software] ([http://www.biochemfusion.com/doc/Biochemfusion_PLN_1.4_spec.pdf specification])
* SAM – [[SAM (file format)|SAM]], Sequence Alignment Map format, in which the results of the [[1000 Genomes Project]] will be released
* SBML – [[SBML|The Systems Biology Markup Language]] is used to store biochemical network computational models
* SCF – Staden chromatogram files used to store data from [[DNA sequencing]]
* SFF – [[Standard Flowgram Format]]
* SRA – Format used by the [[National Center for Biotechnology Information]] Short Read Archive to store high-throughput DNA sequence data
* Stockholm – The [[Stockholm format]] for representing [[multiple sequence alignment]]s
* Swiss-Prot – The flatfile format used to represent database records for [[peptide sequence|protein]] sequences from the [[Swiss-Prot]] database
* VCF – [[Variant Call Format]], a standard created by the [[1000 Genomes Project]] that lists and annotates the entire collection of human variants (with the exception of approximately 1.6 million variants).
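
As an illustration of the parentheses-and-commas notation mentioned in the NWK entry, the following is a minimal sketch of a Newick reader in Python. It assumes a simplified subset of the format (unquoted labels, optional <code>:length</code> suffixes, no comments), and the function name <code>parse_newick</code> and the dictionary layout are hypothetical choices made for this example rather than part of any standard library.

```python
def parse_newick(text):
    """Parse a simplified Newick string into nested dicts.

    Each node is {"name": str, "length": float or None, "children": list}.
    Assumption: labels are unquoted and the string has no [comments].
    """
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0  # index of the next unread character

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume '('
            while True:
                children.append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this child list
                    break
                else:
                    raise ValueError(f"unexpected {text[pos]!r} at {pos}")
        # Optional node label, read up to the next structural character.
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]
        # Optional edge length after ':'.
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected {text[pos]!r} at {pos}")
    return root


def leaf_names(node):
    """Return the names of all leaves under a node, left to right."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


tree = parse_newick("(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;")
print(leaf_names(tree))
```

Here each pair of parentheses becomes one internal node, commas separate siblings, and the text after a closing parenthesis supplies that node's label and edge length. Full Newick files may also contain quoted labels and bracketed comments, which this sketch does not handle.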