Package org.w3c.tidy

Class TidyUtils


  • public final class TidyUtils
    extends java.lang.Object
    Utility class with handy methods, mainly for String handling or for reproducing C behaviours.
    Version:
    $Revision $ ($Author $)
    Author:
    Fabrizio Giustina
    • Method Summary

      Modifier and Type Method Description
      static boolean findBadSubString​(java.lang.String s, java.lang.String p, int len)
      Return true if substring s is in p and isn't all in upper case.
      static char foldCase​(char c, boolean tocaps, boolean xmlTags)
      Fold case of a char.
      static byte[] getBytes​(java.lang.String str)
      Converts a String to UTF-8 bytes. Conversion to and from UTF-8 should always be possible, so encoding exceptions are converted to an Error to avoid adding throws declarations to many methods.
      static java.lang.String getString​(byte[] bytes, int offset, int length)
      Converts a range of UTF-8 bytes to a String. Conversion to and from UTF-8 should always be possible, so encoding exceptions are converted to an Error to avoid adding throws declarations to many methods.
      static boolean isCharEncodingSupported​(java.lang.String name)
      Is the given character encoding supported?
      static boolean isDigit​(char c)
      Is the given char a digit?
      static boolean isLetter​(char c)
      Is the given char a letter?
      static boolean isLower​(char c)
      Determines if the specified character is a lowercase character.
      static boolean isNamechar​(char c)
      Is the given char valid in name? (letter, digit or "-", ".", ":", "_")
      static boolean isUpper​(char c)
      Determines if the specified character is an uppercase character.
      static boolean isWhite​(char c)
      Determines if the specified character is whitespace.
      static int lastChar​(java.lang.String str)
      Return the last char in string.
      static char toLower​(char c)
      Maps the given character to its lowercase equivalent.
      static char toUpper​(char c)
      Maps the given character to its uppercase equivalent.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • findBadSubString

        public static boolean findBadSubString​(java.lang.String s,
                                               java.lang.String p,
                                               int len)
        Return true if substring s is in p and isn't all in upper case. This is used to check the case of SYSTEM, PUBLIC, DTD and EN.
        Parameters:
        s - substring
        p - full string
        len - how many chars to check in p
        Returns:
        true if substring s is in p and isn't all in upper case
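As an illustration of the check described above, here is a minimal sketch written against the JDK only. It is not the JTidy source: the class name `BadSubStringSketch` is hypothetical, and the sketch assumes that `s` is passed in upper case (for example "PUBLIC") and that only the first case-insensitive match within the first `len` chars of `p` is examined.

```java
final class BadSubStringSketch {

    // Hypothetical stand-in for the documented behaviour, not the library code.
    // Scans the first len chars of p for s, ignoring case, and reports whether
    // the first match differs from s in case.
    static boolean findBadSubString(String s, String p, int len) {
        int n = s.length();
        int limit = Math.min(len, p.length());
        for (int i = 0; i + n <= limit; i++) {
            if (p.regionMatches(true, i, s, 0, n)) {
                // s was found ignoring case; it is "bad" unless the case matches exactly.
                return !p.regionMatches(false, i, s, 0, n);
            }
        }
        return false; // s does not occur in the checked range
    }

    public static void main(String[] args) {
        String good = "html PUBLIC \"-//W3C//DTD HTML 4.01//EN\"";
        String bad = "html public \"-//W3C//DTD HTML 4.01//EN\"";
        System.out.println(findBadSubString("PUBLIC", good, good.length()));
        System.out.println(findBadSubString("PUBLIC", bad, bad.length()));
    }
}
```

In this sketch a doctype that spells the keyword as "public" is reported, while one that spells it "PUBLIC" or omits it entirely is not.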
      • getBytes

        public static byte[] getBytes​(java.lang.String str)
        Converts a String to UTF-8 bytes. Conversion to and from UTF-8 should always be possible, so encoding exceptions are converted to an Error to avoid adding throws declarations to many methods.
        Parameters:
        str - String
        Returns:
        UTF-8 bytes
        See Also:
        String.getBytes()
      • getString

        public static java.lang.String getString​(byte[] bytes,
                                                 int offset,
                                                 int length)
        Converts a range of UTF-8 bytes to a String. Conversion to and from UTF-8 should always be possible, so encoding exceptions are converted to an Error to avoid adding throws declarations to many methods.
        Parameters:
        bytes - byte array
        offset - starting offset in byte array
        length - length in byte array starting from offset
        Returns:
        same as new String(bytes, offset, length, "UTF8")
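The exception handling described for getBytes and getString can be sketched roughly as follows. This is an assumption about the pattern, not the library source: the class name `Utf8Sketch` and the error messages are hypothetical, and only standard JDK calls are used.

```java
import java.io.UnsupportedEncodingException;

final class Utf8Sketch {

    // Hypothetical stand-in: wraps the checked exception in an Error so that
    // callers do not need a throws declaration.
    static byte[] getBytes(String str) {
        try {
            return str.getBytes("UTF8");
        } catch (UnsupportedEncodingException e) {
            throw new Error("String to UTF-8 conversion failed: " + e.getMessage());
        }
    }

    // Same pattern for the reverse direction.
    static String getString(byte[] bytes, int offset, int length) {
        try {
            return new String(bytes, offset, length, "UTF8");
        } catch (UnsupportedEncodingException e) {
            throw new Error("UTF-8 to String conversion failed: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        byte[] bytes = getBytes("h\u00e9llo");
        // Round trip: decoding the full range gives back the original text.
        System.out.println(getString(bytes, 0, bytes.length).equals("h\u00e9llo"));
    }
}
```

Every Java platform is required to support UTF-8, so the catch branches should be unreachable in practice, which is the rationale the description gives for using an unchecked Error.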
      • lastChar

        public static int lastChar​(java.lang.String str)
        Return the last char in string. This is useful when the trailing quotation mark is missing on an attribute.
        Parameters:
        str - String
        Returns:
        last char in String
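A minimal sketch of how such a helper might be used to detect a missing trailing quotation mark. The class name `LastCharSketch` is hypothetical, and returning 0 for a null or empty string is an assumption that the documentation above does not state.

```java
final class LastCharSketch {

    // Hypothetical stand-in, not the library code.
    static int lastChar(String str) {
        if (str != null && str.length() > 0) {
            return str.charAt(str.length() - 1);
        }
        return 0; // assumption: no last char available
    }

    public static void main(String[] args) {
        String value = "\"title"; // attribute value with an opening quote only
        char quote = value.charAt(0);
        // If the last char is not the opening quote, the closing quote is missing.
        System.out.println(lastChar(value) != quote);
    }
}
```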
      • isWhite

        public static boolean isWhite​(char c)
        Determines if the specified character is whitespace.
        Parameters:
        c - char
        Returns:
        true if char is whitespace.
      • isDigit

        public static boolean isDigit​(char c)
        Is the given char a digit?
        Parameters:
        c - char
        Returns:
        true if the given char is a digit
      • isLetter

        public static boolean isLetter​(char c)
        Is the given char a letter?
        Parameters:
        c - char
        Returns:
        true if the given char is a letter
      • isNamechar

        public static boolean isNamechar​(char c)
        Is the given char valid in name? (letter, digit or "-", ".", ":", "_")
        Parameters:
        c - char
        Returns:
        true if char is a name char.
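The rule above (letter, digit or "-", ".", ":", "_") can be sketched as a plain predicate. The class name `NameCharSketch` is hypothetical, and restricting letters and digits to ASCII is an assumption made for the sketch; the class's own isLetter and isDigit tests may accept a different set of characters.

```java
final class NameCharSketch {

    // Hypothetical stand-in: ASCII letters and digits only (an assumption).
    static boolean isNamechar(char c) {
        boolean letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        boolean digit = c >= '0' && c <= '9';
        return letter || digit || c == '-' || c == '.' || c == ':' || c == '_';
    }

    // True if every char of the candidate name passes the check.
    static boolean allNamechars(String name) {
        for (int i = 0; i < name.length(); i++) {
            if (!isNamechar(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allNamechars("xml:lang"));
        System.out.println(allNamechars("data value"));
    }
}
```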
      • isLower

        public static boolean isLower​(char c)
        Determines if the specified character is a lowercase character.
        Parameters:
        c - char
        Returns:
        true if char is lower case.
      • isUpper

        public static boolean isUpper​(char c)
        Determines if the specified character is an uppercase character.
        Parameters:
        c - char
        Returns:
        true if char is upper case.
      • toLower

        public static char toLower​(char c)
        Maps the given character to its lowercase equivalent.
        Parameters:
        c - char
        Returns:
        lowercase char.
      • toUpper

        public static char toUpper​(char c)
        Maps the given character to its uppercase equivalent.
        Parameters:
        c - char
        Returns:
        uppercase char.
      • foldCase

        public static char foldCase​(char c,
                                    boolean tocaps,
                                    boolean xmlTags)
        Fold case of a char.
        Parameters:
        c - char
        tocaps - convert to caps
        xmlTags - use XML tags? If true, no change is performed
        Returns:
        folded char
        To do:
        check the use of xmlTags parameter
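The parameter description above suggests the following decision logic. This is a sketch under stated assumptions, not the library source: the class name `FoldCaseSketch` is hypothetical, the JDK methods `Character.toUpperCase` and `Character.toLowerCase` stand in for toUpper and toLower, and folding to lower case when `tocaps` is false is an assumption.

```java
final class FoldCaseSketch {

    // Hypothetical stand-in, not the library code.
    static char foldCase(char c, boolean tocaps, boolean xmlTags) {
        if (xmlTags) {
            return c; // documented: no change is performed for XML tags
        }
        // Assumption: tocaps selects upper case, otherwise fold to lower case.
        return tocaps ? Character.toUpperCase(c) : Character.toLowerCase(c);
    }

    public static void main(String[] args) {
        System.out.println(foldCase('a', true, false));
        System.out.println(foldCase('A', false, false));
        System.out.println(foldCase('a', true, true));
    }
}
```

Leaving the character untouched when `xmlTags` is true fits XML's case-sensitive names, although the "To do" note above indicates the parameter's use was still under review.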
      • isCharEncodingSupported

        public static boolean isCharEncodingSupported​(java.lang.String name)
        Is the given character encoding supported?
        Parameters:
        name - character encoding name
        Returns:
        true if encoding is supported, false otherwise.
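One possible way to implement such a check with the JDK is shown below. This is a sketch, not the library source: the class name `EncodingCheckSketch` is hypothetical, and the method may well resolve names differently (for example by mapping IANA names to Java names first).

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

final class EncodingCheckSketch {

    // Hypothetical stand-in based on java.nio.charset.Charset.
    static boolean isCharEncodingSupported(String name) {
        if (name == null) {
            return false; // Charset.isSupported rejects null with an exception
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false; // syntactically illegal names are treated as unsupported
        }
    }

    public static void main(String[] args) {
        System.out.println(isCharEncodingSupported("UTF-8"));
        System.out.println(isCharEncodingSupported("no such encoding!"));
    }
}
```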