2423.03 Presentation and Numbering of Sequences [R-07.2022]

[Editor Note: This section is not applicable to applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b). See MPEP §§ 2412-2419 for guidance on WIPO ST.26 requirements for applications filed on or after July 1, 2022.]

37 CFR 1.822(c)(5) provides that nucleotide sequences shall only be represented by a single strand, in the 5′ to 3′ direction, from left to right. That is, double stranded nucleotides shall not be represented in the “Sequence Listing”. A double stranded nucleotide may be represented as two single stranded nucleotides, and any relationship between the two may be shown in the drawings.
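
As an illustration only, the following minimal Python sketch shows how the second strand of a fully complementary double stranded molecule might be written as its own single strand in the 5′ to 3′ direction. The function name and the lowercase example sequence are hypothetical and are not part of the rule or of any Office software; the sketch assumes the two strands are exact complements and handles only the bases a, c, g and t.

```python
# Hypothetical helper for illustration; not part of 37 CFR 1.822 or any USPTO tool.
# Only a, c, g and t are mapped; any other symbol is passed through unchanged.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def second_strand_5_to_3(strand: str) -> str:
    """Return the complementary strand written 5' to 3' (the reverse complement).

    `strand` is assumed to be written 5' to 3', left to right.
    """
    return strand.translate(_COMPLEMENT)[::-1]


top = "atgcc"                        # first strand, 5' to 3'
bottom = second_strand_5_to_3(top)   # second strand, also 5' to 3'
print(top, bottom)
```

Under these assumptions, each strand would be presented as a separate single stranded sequence, and the pairing between them could be shown in the drawings.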

The procedures for presenting and numbering amino acid sequences are set forth in 37 CFR 1.822(d). Two alternatives are presented for numbering amino acid sequences. Amino acid sequences may be numbered with respect to the identification of the first amino acid of the first mature protein or with respect to the first amino acid appearing at the amino terminal. The numbering procedure for nucleotides is set forth in 37 CFR 1.822(c)(6). Sequences that are circular in configuration are intended to be encompassed by these rules, and the numbering procedures described above remain applicable with the exception that the designation of the first nucleotide base or amino acid of the sequence may be made at the option of the applicant. See 37 CFR 1.822(c)(7) and (d)(4).
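
The two numbering alternatives, and the applicant's choice of starting point for a circular sequence, can be sketched as follows. This is a hedged illustration, not a statement of the rule: the function names are hypothetical, and the paragraph above does not spell out how residues preceding the mature protein are labeled. The sketch assumes they receive negative numbers counting backward from the residue adjacent to number 1, with no zero; 37 CFR 1.822(d) should be consulted for the controlling text.

```python
# Hypothetical helpers for illustration; names and conventions are assumptions.

def number_residues(length, mature_start=None):
    """Return a position label for each residue of a sequence of `length`.

    mature_start: 0-based index of the first amino acid of the first mature
    protein, or None to number from the amino terminal starting at 1.
    Assumption: residues before the mature protein get negative numbers
    counting backward from the residue next to number 1 (no zero is used).
    """
    if mature_start is None:
        return list(range(1, length + 1))
    labels = []
    for i in range(length):
        offset = i - mature_start
        labels.append(offset + 1 if offset >= 0 else offset)
    return labels


def rotate_circular(seq, start):
    """Write a circular sequence beginning at the chosen 0-based `start`."""
    return seq[start:] + seq[:start]


print(number_residues(6))                  # numbered from the amino terminal
print(number_residues(6, mature_start=2))  # numbered from the mature protein
print(rotate_circular("acgtta", 2))        # same circle, different first base
```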

The procedures for presenting and numbering hybrid and gapped sequences are set forth in 37 CFR 1.822(e). A sequence with a gap or gaps shall be presented as a plurality of separate sequences, each having a separate sequence identifier, with the number of separate sequences equal to the number of continuous strings of sequence data. The term “gap” is not intended to embrace a gap or gaps introduced into the presentation of otherwise continuous sequence information in, e.g., a drawing figure, to show alignments or similarities with other sequences. The “gaps” referred to in this section are gaps representing unknown or undisclosed regions in a sequence between regions that are known or disclosed. On the other hand, a sequence that contains one or more regions of contiguous “n” or “Xaa” residues, wherein the exact number of “n” or “Xaa” residues in each region is disclosed, must be included in the “Sequence Listing” as a single sequence with a single sequence identifier. A sequence disclosed by enumeration of its residues that is constructed as a single continuous sequence from one or more non-contiguous segments of a larger sequence or segments from different sequences must be included in the “Sequence Listing” as a single sequence with a single sequence identifier. A fragment of a larger sequence need not be enumerated by its residues, and may be referred to in the specification, claims or drawings as, e.g., “residues 2 through 33 of SEQ ID NO:12,” assuming that SEQ ID NO:12 has been properly included in the “Sequence Listing”.
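
The treatment of gapped sequences can be sketched in Python as follows. The sketch is illustrative only and rests on stated assumptions: the marker "..." is a hypothetical notation for an unknown or undisclosed region, the function names are invented for this example, and the sequence identifiers are assigned consecutively from 1. A run of “n” residues of disclosed length is not a gap under this sketch and stays within a single sequence.

```python
# Hypothetical notation and helpers for illustration only.
GAP = "..."  # assumed marker for an unknown or undisclosed region


def split_at_gaps(disclosed, first_seq_id=1):
    """Split a gapped disclosure into one entry per continuous string.

    Returns a dict mapping an assumed SEQ ID NO to each continuous string,
    so the number of entries equals the number of continuous strings.
    """
    segments = [s for s in disclosed.split(GAP) if s]
    return {first_seq_id + i: seg for i, seg in enumerate(segments)}


def fragment(sequence, first, last):
    """Return residues `first` through `last` (1-based, inclusive)."""
    return sequence[first - 1:last]


listing = split_at_gaps("atgc...ggnnnnta...cc")
print(listing)                   # three continuous strings, three identifiers
print(fragment(listing[2], 2, 3))  # "residues 2 through 3 of SEQ ID NO:2"
```

Referring to a fragment by residue range, as in the last line, avoids listing the fragment as a separate sequence, provided the parent sequence is itself in the listing.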