Class ParsedURL

java.lang.Object
org.apache.batik.util.ParsedURL

public class ParsedURL extends Object
A URL-like class that supports custom URI schemes and GZIP encoding.

This class is used as a replacement for URL, for several reasons. First, unlike URL, this class will accept and parse as much of a URL as possible without throwing a MalformedURLException. This makes it useful for simply parsing a URL string (hence its name).

Second, it allows for extension of the URI schemes supported by the parser. Batik uses this to support the data: URL scheme (RFC2397).
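The extension mechanism can be pictured as a registry keyed by protocol name with a fallback entry. The following is a minimal sketch of that idea using only the JDK; the Handler interface and the register/lookup names are hypothetical stand-ins, not Batik's ParsedURLProtocolHandler API.

```java
import java.util.HashMap;
import java.util.Map;

public class HandlerRegistrySketch {
    /** Hypothetical stand-in for a protocol handler; not Batik's interface. */
    interface Handler {
        /** Protocol this handler serves, or null to act as the default. */
        String protocol();
    }

    private static final Map<String, Handler> HANDLERS = new HashMap<>();

    // Fallback used when no handler matches (assumed behaviour for this sketch).
    private static Handler defaultHandler = () -> null;

    /** A handler reporting a null protocol replaces the default handler. */
    static synchronized void register(Handler h) {
        if (h.protocol() == null) {
            defaultHandler = h;
        } else {
            HANDLERS.put(h.protocol(), h);
        }
    }

    /** Returns the matching handler, or the default for null/unknown protocols. */
    static synchronized Handler lookup(String protocol) {
        if (protocol == null) {
            return defaultHandler;
        }
        Handler h = HANDLERS.get(protocol);
        return h != null ? h : defaultHandler;
    }

    public static void main(String[] args) {
        Handler data = () -> "data";
        register(data);
        System.out.println(lookup("data") == data);   // prints "true"
        System.out.println(lookup("gopher") == data); // prints "false"
    }
}
```

Keeping the fallback outside the map means an unknown scheme still gets a best-effort parse instead of an exception, which matches the lenient behaviour described above.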

Third, by default it checks the streams that it opens to see if they are GZIP compressed, and if so it automatically decompresses them (avoiding opening the stream twice in the process).
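One way to detect GZIP content without reopening the stream is to mark it, peek at the first two bytes, and reset. The sketch below assumes detection by the GZIP magic number (0x1f 0x8b) only; the class and method names are hypothetical, and Batik's own checkGZIP may perform additional checks.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipSniffSketch {
    /** Wraps the stream in a GZIPInputStream if it starts with the GZIP magic bytes. */
    static InputStream maybeGunzip(InputStream in) throws IOException {
        // mark/reset is needed to peek, so buffer the stream if it lacks support.
        InputStream buffered = in.markSupported() ? in : new BufferedInputStream(in);
        buffered.mark(2);
        int b0 = buffered.read();
        int b1 = buffered.read();
        buffered.reset(); // rewind so the caller still sees every byte
        boolean gzip = (b0 == 0x1f) && (b1 == 0x8b);
        return gzip ? new GZIPInputStream(buffered) : buffered;
    }

    /** Reads a stream fully into a byte array. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = "hello".getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write(plain);
        }

        // Both the compressed and the plain input decode to the same text.
        InputStream a = maybeGunzip(new ByteArrayInputStream(compressed.toByteArray()));
        InputStream b = maybeGunzip(new ByteArrayInputStream(plain));
        System.out.println(new String(readAll(a), StandardCharsets.UTF_8)); // prints "hello"
        System.out.println(new String(readAll(b), StandardCharsets.UTF_8)); // prints "hello"
    }
}
```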

It is worth noting that most real work is deferred to the ParsedURLData class, to which most methods are forwarded. This allows ParsedURL to offer a constructor interface (mostly for compatibility with core URL), even though the real implementation uses the protocol handlers as factories for protocol-specific instances of the ParsedURLData class.

Version:
$Id: ParsedURL.java 1804130 2017-08-04 14:41:11Z ssteiner $
  • Field Details

    • data

      The data class we defer most things to.
    • userAgent

      String userAgent
      The user agent to associate with this URL.
    • handlersMap

      private static Map handlersMap
      This maps between protocol names and ParsedURLProtocolHandler instances.
    • defaultHandler

      private static ParsedURLProtocolHandler defaultHandler
      The default protocol handler. This handler is used when other handlers fail or no match for a protocol can be found.
    • globalUserAgent

      private static String globalUserAgent
  • Constructor Details

    • ParsedURL

      public ParsedURL(String urlStr)
      Construct a ParsedURL from the given URL string.
      Parameters:
      urlStr - The string to try and parse as a URL
    • ParsedURL

      public ParsedURL(URL url)
      Construct a ParsedURL from the given java.net.URL instance. This is useful if you already have a valid java.net.URL instance. This bypasses most of the parsing and hence is quicker and less prone to reinterpretation than converting the URL to a string before construction.
      Parameters:
      url - The URL to "mimic".
    • ParsedURL

      public ParsedURL(String baseStr, String urlStr)
      Construct a sub URL from two strings.
      Parameters:
      baseStr - The 'parent' URL. Should be complete.
      urlStr - The 'sub' URL; may be complete or partial. Any missing pieces will be taken from baseStr.
    • ParsedURL

      public ParsedURL(URL baseURL, String urlStr)
      Construct a sub URL from a base URL and a string for the sub URL.
      Parameters:
      baseURL - The 'parent' URL.
      urlStr - The 'sub' URL; may be complete or partial. Any missing pieces will be taken from baseURL.
    • ParsedURL

      public ParsedURL(ParsedURL baseURL, String urlStr)
      Construct a sub URL from a base ParsedURL and a string for the sub URL.
      Parameters:
      baseURL - The 'parent' URL.
      urlStr - The 'sub' URL; may be complete or partial. Any missing pieces will be taken from baseURL.
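The "missing pieces are taken from the base" behaviour of the sub-URL constructors resembles standard relative-reference resolution. As a rough illustration, the sketch below uses java.net.URI.resolve from the JDK; it is an analogy rather than Batik's code, ParsedURL may differ in edge cases, and the example.com addresses are made up.

```java
import java.net.URI;

public class SubUrlSketch {
    /** Resolves urlStr against baseStr using the JDK's URI resolution rules. */
    static String resolve(String baseStr, String urlStr) {
        return URI.create(baseStr).resolve(urlStr).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.com/images/logo.svg";

        // Partial sub URL: scheme, host and directory come from the base.
        System.out.println(resolve(base, "icon.png"));
        // prints "http://example.com/images/icon.png"

        // Fragment only: everything except the fragment comes from the base.
        System.out.println(resolve(base, "#layer1"));
        // prints "http://example.com/images/logo.svg#layer1"

        // Complete sub URL: nothing is taken from the base.
        System.out.println(resolve(base, "https://other.example/x.svg"));
        // prints "https://other.example/x.svg"
    }
}
```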
  • Method Details

    • getGlobalUserAgent

      public static String getGlobalUserAgent()
    • setGlobalUserAgent

      public static void setGlobalUserAgent(String userAgent)
    • getHandlersMap

      private static Map getHandlersMap()
      Returns the shared instance of HandlersMap. This method is also responsible for initializing the handler map if this is the first time it has been requested since the class was loaded.
    • getHandler

      public static ParsedURLProtocolHandler getHandler(String protocol)
      Returns the handler for a particular protocol. If protocol is null or no match is found in the handlers map it returns the default protocol handler.
      Parameters:
      protocol - The protocol to get a handler for.
    • registerHandler

      public static void registerHandler(ParsedURLProtocolHandler handler)
      Registers a protocol handler by adding it to the handlers map. If the given protocol handler returns null as its supported protocol, then it is registered as the default protocol handler.
      Parameters:
      handler - The new protocol handler to register.
    • checkGZIP

      public static InputStream checkGZIP(InputStream is) throws IOException
      This is a utility function others can call that checks whether is is a GZIP stream. If so, it returns a GZIPInputStream that will decode the contents; otherwise it returns is (or a buffered version of is) untouched.
      Parameters:
      is - Stream that may potentially be a GZIP stream.
      Throws:
      IOException
    • toString

      public String toString()
      Returns a string representation of the URL (can be passed back into the constructor if desired).
      Overrides:
      toString in class Object
    • getPostConnectionURL

      public String getPostConnectionURL()
      Returns the URL that was ultimately used to fetch the resource represented by this ParsedURL. For HTTP URLs, this will result in the post-redirect URL being returned. If there was no redirect, or if this isn't an HTTP URL, the original URL is returned (the same string as toString()).
    • equals

      public boolean equals(Object obj)
      Implement Object.equals. Relies heavily on the contained ParsedURLData's implementation of equals.
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Implement Object.hashCode. Relies on the contained ParsedURLData's implementation of hashCode.
      Overrides:
      hashCode in class Object
    • complete

      public boolean complete()
      Returns true if the URL looks well formed and complete. This does not guarantee that the stream can be opened but is a good indication that things aren't totally messed up.
    • getUserAgent

      public String getUserAgent()
      Returns the user agent currently associated with this URL (or null if none).
    • setUserAgent

      public void setUserAgent(String userAgent)
      Sets the user agent associated with this URL (null clears any associated user agent).
    • getProtocol

      public String getProtocol()
      Returns the protocol for this URL. The protocol is everything up to the first ':'.
    • getHost

      public String getHost()
      Returns the host for this URL, if any; returns null if there isn't one or it doesn't make sense for the protocol.
    • getPort

      public int getPort()
      Returns the port on the host to connect to if it was specified in the URL that was parsed; otherwise returns -1.
    • getPath

      public String getPath()
      Returns the path for this URL, if any (where appropriate for the protocol this also includes the file, not just directory). Note that getPath appears in JDK 1.3 as a synonym for getFile from JDK 1.2.
    • getRef

      public String getRef()
      Returns the 'fragment' reference in the URL.
    • getPortStr

      public String getPortStr()
      Returns the URL up to and including the port number on the host. Does not include the path or fragment pieces.
    • getContentType

      public String getContentType()
      Returns the content type if available. This is only available for some protocols.
    • getContentTypeMediaType

      public String getContentTypeMediaType()
      Returns the content type's type/subtype, if available. This is only available for some protocols.
    • getContentTypeCharset

      public String getContentTypeCharset()
      Returns the content type's charset parameter, if available. This is only available for some protocols.
    • hasContentTypeParameter

      public boolean hasContentTypeParameter(String param)
      Returns whether the Content-Type header has the given parameter.
    • getContentEncoding

      public String getContentEncoding()
      Returns the content encoding if available. This is only available for some protocols.
    • openStream

      public InputStream openStream() throws IOException
      Attempts to open the stream, checking for common compression types and automatically decompressing them if found.
      Throws:
      IOException
    • openStream

      public InputStream openStream(String mimeType) throws IOException
      Attempts to open the stream, checking for common compression types and automatically decompressing them if found.
      Parameters:
      mimeType - The expected MIME type of the content in the returned InputStream (mapped to the HTTP Accept header, among other possibilities).
      Throws:
      IOException
    • openStream

      public InputStream openStream(String[] mimeTypes) throws IOException
      Attempts to open the stream, checking for common compression types and automatically decompressing them if found.
      Parameters:
      mimeTypes - The expected MIME types of the content in the returned InputStream (mapped to the HTTP Accept header, among other possibilities).
      Throws:
      IOException
    • openStream

      public InputStream openStream(Iterator mimeTypes) throws IOException
      Attempts to open the stream, checking for common compression types and automatically decompressing them if found.
      Parameters:
      mimeTypes - The expected MIME types of the content in the returned InputStream (mapped to the HTTP Accept header, among other possibilities). The elements of the iterator must be strings.
      Throws:
      IOException
    • openStreamRaw

      public InputStream openStreamRaw() throws IOException
      Attempts to open the stream without checking for compression types.
      Throws:
      IOException
    • openStreamRaw

      public InputStream openStreamRaw(String mimeType) throws IOException
      Attempts to open the stream without checking for compression types.
      Parameters:
      mimeType - The expected MIME type of the content in the returned InputStream (mapped to the HTTP Accept header, among other possibilities).
      Throws:
      IOException
    • openStreamRaw

      public InputStream openStreamRaw(String[] mimeTypes) throws IOException
      Attempts to open the stream without checking for compression types.
      Parameters:
      mimeTypes - The expected MIME types of the content in the returned InputStream (mapped to the HTTP Accept header, among other possibilities).
      Throws:
      IOException
    • openStreamRaw

      public InputStream openStreamRaw(Iterator mimeTypes) throws IOException
      Attempts to open the stream without checking for compression types.
      Parameters:
      mimeTypes - The expected MIME types of the content in the returned InputStream (mapped to the HTTP Accept header, among other possibilities). The elements of the iterator must be strings.
      Throws:
      IOException
    • sameFile

      public boolean sameFile(ParsedURL other)
    • getProtocol

      protected static String getProtocol(String urlStr)
      Parses out the protocol from a URL string. Used internally to select the proper handler; all other parsing is done by the selected protocol handler.
    • parseURL

      public static ParsedURLData parseURL(String urlStr)
      Factory method to construct an appropriate subclass of ParsedURLData.
      Parameters:
      urlStr - the string to parse.
    • parseURL

      public static ParsedURLData parseURL(String baseStr, String urlStr)
      Factory method to construct an appropriate subclass of ParsedURLData for a sub URL.
      Parameters:
      baseStr - The base URL string to parse.
      urlStr - the sub URL string to parse.
    • parseURL

      public static ParsedURLData parseURL(ParsedURL baseURL, String urlStr)
      Factory method to construct an appropriate subclass of ParsedURLData for a sub URL.
      Parameters:
      baseURL - The base ParsedURL to parse.
      urlStr - the sub URL string to parse.
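The parseURL factories rely on first extracting the protocol ("everything up to the first ':'") so that a handler can be chosen. A minimal sketch of that extraction step follows; the validity rules (RFC 3986 scheme characters) and the lower-casing are assumptions made for this example, and the names are hypothetical rather than Batik's.

```java
import java.util.Locale;

public class ProtocolSketch {
    /**
     * Returns the scheme preceding the first ':' in lower case, or null if the
     * string has no plausible scheme (assumed rules: RFC 3986 scheme syntax).
     */
    static String protocolOf(String urlStr) {
        if (urlStr == null) {
            return null;
        }
        int colon = urlStr.indexOf(':');
        if (colon <= 0) {
            return null; // no ':' at all, or nothing before it
        }
        for (int i = 0; i < colon; i++) {
            char c = urlStr.charAt(i);
            boolean alpha = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            boolean other = (c >= '0' && c <= '9') || c == '+' || c == '-' || c == '.';
            // The first character must be a letter; later ones may also be digits, '+', '-' or '.'.
            if (!(alpha || (i > 0 && other))) {
                return null;
            }
        }
        return urlStr.substring(0, colon).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(protocolOf("data:text/plain,hi"));    // prints "data"
        System.out.println(protocolOf("HTTP://example.com/"));   // prints "http"
        System.out.println(protocolOf("images/a:b.png"));        // prints "null"
        System.out.println(protocolOf("logo.svg"));              // prints "null"
    }
}
```

Returning null for strings without a recognizable scheme lets the caller fall back to a default handler, which would then treat the string as a relative or partial URL.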