This tutorial requires:
FASTA: a text format used to store nucleic acid sequences (such as DNA sequences) or protein sequences. A single file may contain multiple sequences. FASTA files often start with a header line that may contain comments or other information. The rest of the file contains sequence data. Each sequence starts with a ">" symbol followed by the name of the sequence. The rest of that line describes the sequence, and the following lines, up to the next ">" or the end of the file, contain the sequence itself. Example:
>human_T1 (UCSC April 2002 chr7:115977709:)
TTGTCAGATTCACCAAAGTTGAAATGAAGGAAAAAATGCTAAGGGCAGCC
AGAGAGAGGTCAGGTTACCCACAAAGGGAAGCCCATCAGACTAACAGCG
ATCTCTCGGCAGAAACCCTACAGGCCAGAAGAGAGTGGGGGCCAATATT
CATATTCTTAAAGAAAAGAATTTTCAACCCAGAATTTCATATCCAGCCAA
>human_T2 (UCSC April 2002 chr5:11977710:)
ATACGACYCTTATTGTTAGTATATAATTTATATGAAAACMAAAAATTATG
GCGGTATTTTAAGCTTTTCAGAGGAATTTGCTCTTTAATGGATAAAAC
CTAAATCTTACTAGAATTAGTAAAGCAGTTTGTATACCACT
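
The layout described above (a ">" line giving the name and description, followed by one or more sequence lines) can be read with a few lines of Python. This is only a minimal sketch, not part of the tutorial's own tooling: the function name `parse_fasta` and the `(name, description, sequence)` tuple layout are assumptions chosen for illustration, and it assumes each record's header and sequence are on separate lines.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (name, description, sequence) tuples.

    Hypothetical helper for illustration; lines before the first ">" are
    treated as leading comments and ignored.
    """
    records = []
    name, description, chunks = None, "", []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, description, "".join(chunks)))
            header = line[1:].strip()
            parts = header.split(None, 1)  # name, then the rest as description
            name = parts[0] if parts else ""
            description = parts[1] if len(parts) > 1 else ""
            chunks = []
        elif name is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line)
    if name is not None:
        records.append((name, description, "".join(chunks)))
    return records


example = ">seq1 first test\nACGT\nTTGA\n>seq2\nGGCC\n"
for rec_name, rec_desc, rec_seq in parse_fasta(example):
    print(rec_name, len(rec_seq))
```

Joining the sequence lines into a single string means the line width used in the file does not affect the result. For real work, an established library parser is usually a safer choice than a hand-written one.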