2413.02 Form and Format of the XML file containing the “Sequence Listing XML” [R-07.2022]

[Editor Note: This section is applicable to all applications filed on or after July 1, 2022, having disclosures of nucleotide and/or amino acid sequences as defined in 37 CFR 1.831(b).]

37 CFR 1.834 Form and format for nucleotide and/or amino acid sequence submissions as the “Sequence Listing XML” in patent applications filed on or after July 1, 2022.

  • (a) A “Sequence Listing XML” encoded using Unicode UTF–8, created by any means (e.g., text editors, nucleotide/amino acid sequence editors, or other custom computer programs) in accordance with §§ 1.831 through 1.833, must:
    • (1) Have the following compatibilities:
      • (i) Computer compatibility: PC or Mac®; and
      • (ii) Operating system compatibility: MS–DOS®, MS-Windows®, Mac OS®, or Unix®/Linux®.
    • (2) Be in XML format, where all permitted printable characters (including the space character) and nonprintable (control) characters are defined in paragraph 40 of WIPO Standard ST.26 (incorporated by reference, see § 1.839).
    • (3) Be named as *.xml, where “*” is one character or a combination of characters limited to upper- or lowercase letters, numbers, hyphens, and underscores, and the name does not exceed 60 characters in total, excluding the extension. No spaces or other types of characters are permitted in the file name.
  • *****
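
The file-naming requirement in paragraph (a)(3) can be pre-checked before submission. The following is a minimal sketch only, not a USPTO tool: the function name `is_valid_sequence_listing_name` is hypothetical, and the sketch assumes that "letters" means the ASCII letters A-Z and a-z and that the extension must be written in lowercase exactly as `.xml`.

```python
import re

# Assumptions (not stated in the rule text): letters are ASCII only, and the
# extension is the lowercase string ".xml". The name before the extension must
# be 1 to 60 characters drawn from letters, digits, hyphens, and underscores.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,60}\.xml")


def is_valid_sequence_listing_name(file_name: str) -> bool:
    """Return True if file_name fits the pattern described in 37 CFR 1.834(a)(3)."""
    return _NAME_PATTERN.fullmatch(file_name) is not None
```

For example, under these assumptions `is_valid_sequence_listing_name("Seq_Listing-01.xml")` is accepted, while a name containing a space, such as `"seq listing.xml"`, is rejected.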

For the USPTO to be able to process the .xml file containing the “Sequence Listing XML”, all characters must be encoded using Unicode UTF-8. The file must be compatible with PC or Mac® computers using one of the following operating systems: MS–DOS®, MS-Windows®, Mac OS®, or Unix®/Linux®. The printable and nonprintable characters permitted in the .xml file are defined in paragraphs 40 and 41 of WIPO Standard ST.26 (see MPEP § 2413.01(a)). Annex IV of WIPO Standard ST.26 provides a table of the CHARACTER SUBSET FROM THE UNICODE BASIC LATIN CODE TABLE FOR USE IN AN XML INSTANCE OF A SEQUENCE LISTING.
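
The UTF-8 encoding requirement can likewise be checked before filing. The sketch below is illustrative only; the function name `is_utf8` is hypothetical. It confirms only that the bytes of a file decode as UTF-8. It does not check the characters against paragraphs 40 and 41 or Annex IV of WIPO Standard ST.26, and it does not validate the XML itself.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A file's contents could be read with `open(path, "rb").read()` and passed to this function, where `path` is the location of the .xml file on the local system.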