Class HttpParser

java.lang.Object
org.apache.tomcat.util.http.parser.HttpParser

public class HttpParser extends Object
HTTP header value parser implementation. Parsing HTTP headers as per RFC 2616 is not always as simple as it first appears. For headers that only use tokens, the simple approach is normally sufficient. For the other headers, simple code handles the vast majority of cases, but a number of edge cases make things far more complicated. The purpose of this parser is to handle those edge cases so that calling code does not have to. It provides tolerant (where safe to do so) parsing of HTTP header values, assuming that wrapped header lines have already been unwrapped. (The Tomcat header processing code does the unwrapping.)
  • Constructor Summary

    Constructors
    Constructor
    Description
    HttpParser(String relaxedPathChars, String relaxedQueryChars)
    Creates a new HTTP parser with optional relaxed character sets for path and query.
  • Method Summary

    Modifier and Type
    Method
    Description
    static boolean
    isAbsolutePath(int c)
    Checks if the given character is valid for an absolute path as per RFC 3986.
    boolean
    isAbsolutePathRelaxed(int c)
    Checks if the given character is valid for a relaxed absolute path.
    static boolean
    isAlpha(int c)
    Checks if the given character is an alphabetic character.
    static boolean
    isControl(int c)
    Checks if the given character is a control character.
    static boolean
    isFieldContent(int c)
    Checks if the given character is valid field-content as per RFC 7230.
    static boolean
    isFieldVChar(int c)
    Checks if the given character is a valid field-vchar as per RFC 7230.
    static boolean
    isHex(int c)
    Checks if the given character is a valid hexadecimal digit.
    static boolean
    isHttpProtocol(int c)
    Checks if the given character is valid for an HTTP protocol version string.
    static boolean
    isNotRequestTarget(int c)
    Checks if the given character is not valid for a request target.
    boolean
    isNotRequestTargetRelaxed(int c)
    Checks if the given character is not valid for a relaxed request target.
    static boolean
    isNumeric(int c)
    Checks if the given character is a numeric digit.
    static boolean
    isQuery(int c)
    Checks if the given character is valid for a query string as per RFC 3986.
    boolean
    isQueryRelaxed(int c)
    Checks if the given character is valid for a relaxed query string.
    static boolean
    isScheme(int c)
    Checks if the given character is valid for a URI scheme as per RFC 3986.
    static boolean
    isScheme(String s)
    Is the provided String a scheme as per RFC 3986?
    static boolean
    isToken(int c)
    Checks if the given character is a valid HTTP token character as per RFC 7230.
    static boolean
    isToken(String s)
    Is the provided String a token as per RFC 7230?
    static boolean
    isUserInfo(int c)
    Checks if the given character is valid for a URI userinfo component as per RFC 3986.
    static boolean
    isWhiteSpace(int c)
    Checks if the given character is whitespace (tab or space).
    static String
    unquote(String input)
    Removes surrounding quotes from a string, handling escaped characters.

    Methods inherited from class Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • HttpParser

      public HttpParser(String relaxedPathChars, String relaxedQueryChars)
      Creates a new HTTP parser with optional relaxed character sets for path and query.
      Parameters:
      relaxedPathChars - Additional characters to allow in the path, or null
      relaxedQueryChars - Additional characters to allow in the query, or null
  • Method Details

    • isNotRequestTargetRelaxed

      public boolean isNotRequestTargetRelaxed(int c)
      Checks if the given character is not valid for a relaxed request target.
      Parameters:
      c - the character to check
      Returns:
      true if the character is not valid for a request target
    • isAbsolutePathRelaxed

      public boolean isAbsolutePathRelaxed(int c)
      Checks if the given character is valid for a relaxed absolute path.
      Parameters:
      c - the character to check
      Returns:
      true if the character is valid for an absolute path
    • isQueryRelaxed

      public boolean isQueryRelaxed(int c)
      Checks if the given character is valid for a relaxed query string.
      Parameters:
      c - the character to check
      Returns:
      true if the character is valid for a query string
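      The relationship between the strict and relaxed checks can be illustrated with a small standalone sketch. It is not Tomcat's implementation: the class name RelaxedQuerySketch and the method isQueryStrict are hypothetical, the strict set is derived from the RFC 3986 query grammar, and the sketch assumes that every ASCII character passed to the constructor is simply added to the allowed set (the real class may restrict which characters can be relaxed).

```java
// Hypothetical illustration only; not the Tomcat HttpParser source.
final class RelaxedQuerySketch {

    // Non-alphanumeric characters permitted by the RFC 3986 query grammar:
    // unreserved and sub-delims symbols, ":", "@", "/", "?", plus "%" for pct-encoded.
    private static final String STRICT_SYMBOLS = "-._~!$&'()*+,;=:@/?%";

    // Per-instance lookup table: strict characters plus any extras supplied by the caller.
    private final boolean[] allowed = new boolean[128];

    // relaxedQueryChars may be null, meaning "no additional characters".
    RelaxedQuerySketch(String relaxedQueryChars) {
        for (int c = 0; c < allowed.length; c++) {
            allowed[c] = isQueryStrict(c);
        }
        if (relaxedQueryChars != null) {
            for (int i = 0; i < relaxedQueryChars.length(); i++) {
                char c = relaxedQueryChars.charAt(i);
                // Assumption: only ASCII characters can be relaxed in this sketch.
                if (c < allowed.length) {
                    allowed[c] = true;
                }
            }
        }
    }

    // Strict check: does not depend on any instance configuration.
    static boolean isQueryStrict(int c) {
        boolean alpha = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        boolean digit = c >= '0' && c <= '9';
        boolean symbol = c >= 0 && c < 128 && STRICT_SYMBOLS.indexOf(c) >= 0;
        return alpha || digit || symbol;
    }

    // Relaxed check: depends on the characters given to the constructor.
    boolean isQueryRelaxed(int c) {
        return c >= 0 && c < allowed.length && allowed[c];
    }
}
```

      In this sketch the strict check is static because it needs no configuration, while the relaxed check is an instance method because its result depends on the constructor arguments. That mirrors the split between the static and instance methods listed above.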
    • unquote

      public static String unquote(String input)
      Removes surrounding quotes from a string, handling escaped characters.
      Parameters:
      input - the string to unquote
      Returns:
      the unquoted string, or null if the input is invalid
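      The general idea of unquoting a quoted-string can be sketched as follows. This is a standalone illustration under stated assumptions, not the Tomcat source: the class name UnquoteSketch is hypothetical, and the sketch assumes that a missing closing quote and a trailing backslash are the inputs treated as invalid. The real method may draw those lines differently.

```java
// Hypothetical illustration only; not the Tomcat HttpParser source.
final class UnquoteSketch {

    static String unquote(String input) {
        // Nothing to strip from null or from strings too short to hold a quote pair.
        if (input == null || input.length() < 2) {
            return input;
        }

        int start = 0;
        int end = input.length();
        if (input.charAt(0) == '"') {
            // Assumption of this sketch: an opening quote without a closing quote is invalid.
            if (input.charAt(end - 1) != '"') {
                return null;
            }
            start = 1;
            end--;
        }

        StringBuilder result = new StringBuilder(end - start);
        for (int i = start; i < end; i++) {
            char c = input.charAt(i);
            if (c == '\\') {
                // A backslash escapes the next character; it must not be the last one.
                i++;
                if (i == end) {
                    return null;
                }
                c = input.charAt(i);
            }
            result.append(c);
        }
        return result.toString();
    }
}
```

      Callers of such a method need to check for null before using the result, since null signals invalid input rather than an empty value.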
    • isToken

      public static boolean isToken(int c)
      Checks if the given character is a valid HTTP token character as per RFC 7230.
      Parameters:
      c - the character to check
      Returns:
      true if the character is a valid token character
    • isToken

      public static boolean isToken(String s)
      Is the provided String a token as per RFC 7230?
      Note: token = 1*tchar (RFC 7230)
      Since a token requires at least 1 tchar, null and the empty string ("") are not considered to be valid tokens.
      Parameters:
      s - The string to test
      Returns:
      true if the string is a valid token, otherwise false
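      The token rule above can be written out as a short standalone check. This sketch follows the tchar definition in RFC 7230 and is not the Tomcat source; the class name TokenSketch and the helper isTchar are hypothetical.

```java
// Hypothetical illustration only; not the Tomcat HttpParser source.
final class TokenSketch {

    // RFC 7230: tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "."
    //                 / "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
    private static final String TCHAR_SYMBOLS = "!#$%&'*+-.^_`|~";

    static boolean isTchar(int c) {
        boolean alpha = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        boolean digit = c >= '0' && c <= '9';
        boolean symbol = c >= 0 && c < 128 && TCHAR_SYMBOLS.indexOf(c) >= 0;
        return alpha || digit || symbol;
    }

    // token = 1*tchar, so null and "" are rejected before the loop.
    static boolean isToken(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!isTchar(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

      Under this grammar a value such as gzip is a token, while text/html is not, because "/" is a delimiter rather than a tchar.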
    • isHex

      public static boolean isHex(int c)
      Checks if the given character is a valid hexadecimal digit.
      Parameters:
      c - the character to check
      Returns:
      true if the character is a valid hex digit
    • isNotRequestTarget

      public static boolean isNotRequestTarget(int c)
      Checks if the given character is not valid for a request target.
      Parameters:
      c - the character to check
      Returns:
      true if the character is not valid for a request target
    • isHttpProtocol

      public static boolean isHttpProtocol(int c)
      Checks if the given character is valid for an HTTP protocol version string.
      Parameters:
      c - the character to check
      Returns:
      true if the character is valid for an HTTP protocol version
    • isAlpha

      public static boolean isAlpha(int c)
      Checks if the given character is an alphabetic character.
      Parameters:
      c - the character to check
      Returns:
      true if the character is alphabetic
    • isNumeric

      public static boolean isNumeric(int c)
      Checks if the given character is a numeric digit.
      Parameters:
      c - the character to check
      Returns:
      true if the character is a numeric digit
    • isScheme

      public static boolean isScheme(int c)
      Checks if the given character is valid for a URI scheme as per RFC 3986.
      Parameters:
      c - the character to check
      Returns:
      true if the character is valid for a URI scheme
    • isScheme

      public static boolean isScheme(String s)
      Is the provided String a scheme as per RFC 3986?
      Note: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
      Since a scheme requires at least 1 ALPHA, null and the empty string ("") are not considered to be valid schemes.
      Parameters:
      s - The string to test
      Returns:
      true if the string is a valid scheme, otherwise false
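      The scheme grammar above translates directly into a first-character check followed by a loop over the remaining characters. The sketch below is a standalone illustration of that grammar, not the Tomcat source; the class name SchemeSketch and its helper methods are hypothetical.

```java
// Hypothetical illustration only; not the Tomcat HttpParser source.
final class SchemeSketch {

    static boolean isAlpha(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Characters allowed after the first one: ALPHA / DIGIT / "+" / "-" / "."
    static boolean isSchemeChar(int c) {
        return isAlpha(c) || (c >= '0' && c <= '9') || c == '+' || c == '-' || c == '.';
    }

    // scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
    static boolean isScheme(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        // The first character is more restricted than the rest: it must be ALPHA.
        if (!isAlpha(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!isSchemeChar(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

      Note the difference from the token check: a scheme constrains its first character separately, so a string such as 1http is rejected even though every character in it is individually a valid scheme character.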
    • isUserInfo

      public static boolean isUserInfo(int c)
      Checks if the given character is valid for a URI userinfo component as per RFC 3986.
      Parameters:
      c - the character to check
      Returns:
      true if the character is valid for userinfo
    • isWhiteSpace

      public static boolean isWhiteSpace(int c)
      Checks if the given character is whitespace (tab or space).
      Parameters:
      c - the character to check
      Returns:
      true if the character is whitespace
    • isAbsolutePath

      public static boolean isAbsolutePath(int c)
      Checks if the given character is valid for an absolute path as per RFC 3986.
      Parameters:
      c - the character to check
      Returns:
      true if the character is valid for an absolute path
    • isQuery

      public static boolean isQuery(int c)
      Checks if the given character is valid for a query string as per RFC 3986.
      Parameters:
      c - the character to check
      Returns:
      true if the character is valid for a query string
    • isControl

      public static boolean isControl(int c)
      Checks if the given character is a control character.
      Parameters:
      c - the character to check
      Returns:
      true if the character is a control character
    • isFieldVChar

      public static boolean isFieldVChar(int c)
      Checks if the given character is a valid field-vchar as per RFC 7230.
      Parameters:
      c - the character to check
      Returns:
      true if the character is valid field-vchar
    • isFieldContent

      public static boolean isFieldContent(int c)
      Checks if the given character is valid field-content as per RFC 7230.
      Parameters:
      c - the character to check
      Returns:
      true if the character is valid field-content