Java Networking. This article describes how to use the java.net package to access web resources. It also describes how to configure a proxy.

1. Java and HTTP access

Java provides an HTTP client API to access resources via the HTTP or HTTPS protocol. The main classes for accessing the Internet are the java.net.URL class and the java.net.HttpURLConnection class.

The URL class can be used to define a pointer to a web resource while the HttpURLConnection class can be used to access a web resource.

HttpURLConnection allows you to obtain an InputStream for the content of the web resource.

Once you have this InputStream, you can read it in the same way as an InputStream from a local file.

HttpURLConnection supports transparent response compression (via the header Accept-Encoding: gzip), Server Name Indication (an extension of SSL and TLS) and a response cache.
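
If a response arrives gzip-compressed and the runtime does not decompress it for you, the body can be decoded by hand. The following is a minimal sketch under that assumption. The class name GzipResponseExample and its helper methods are made up for this illustration, and the compressed bytes are produced locally instead of being fetched from a server.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipResponseExample {

    // Wraps the body in a GZIPInputStream if the Content-Encoding value is "gzip"
    public static InputStream decode(InputStream body, String contentEncoding) throws IOException {
        if (contentEncoding != null && contentEncoding.equalsIgnoreCase("gzip")) {
            return new GZIPInputStream(body);
        }
        return body;
    }

    // Reads the whole stream and interprets the bytes as UTF-8 text
    public static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Demo helper (hypothetical): compresses text the way a server might
    public static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("<html>Hello</html>");
        try (InputStream in = decode(new ByteArrayInputStream(compressed), "gzip")) {
            System.out.println(readAll(in));
        }
    }
}
```

With a real connection you would pass con.getInputStream() and con.getContentEncoding() to decode.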

The API is relatively straightforward. For example, to retrieve the web page www.vogella.com you can use the following code.

package com.vogella.java.introduction;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class DownloadWebpageExample {
    private static final String newLine  = System.getProperty("line.separator");
    public static void main(String[] args) {
        try  {
            URL url = new URL("https://www.vogella.com/");
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            String readStream = readStream(con.getInputStream());
            // Give output for the command line
            System.out.println(readStream);
        } catch (Exception e) {
            e.printStackTrace();
        }

    }

    private static String readStream(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in));) {
            String nextLine = "";
            while ((nextLine = reader.readLine()) != null) {
                sb.append(nextLine + newLine);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return sb.toString();
    }
}

The Javadoc of HttpURLConnection states that each instance is used to make a single request. If you follow this and create a new instance for every request, HttpURLConnection has no threading issues, as no instance is shared between different threads.

Alternatively, you can use the Scanner class.

package com.vogella.java.introduction;

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.Scanner;

public class DownloadWebpageExampleScanner {
    public static void main(String[] args) {
        try (
                InputStream openStream = new URL("https://www.vogella.com/").openStream();
                Scanner scanner = new Scanner(openStream, "UTF-8");) {
            String out = scanner.useDelimiter("\\A").next();
            System.out.println(out);
        } catch (IOException e) {
            e.printStackTrace();
        }

    }
}
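
The delimiter "\\A" only matches at the beginning of the input, so the Scanner returns the complete stream as a single token. The following sketch shows this with an in-memory stream so that it runs without a network connection. The class name ScannerReadAllExample is made up for this illustration. It also checks hasNext() first, because next() throws a NoSuchElementException for an empty stream.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class ScannerReadAllExample {

    // Returns the complete content of the stream, or "" if the stream is empty
    public static String readAll(InputStream in) {
        try (Scanner scanner = new Scanner(in, "UTF-8")) {
            // "\\A" matches only at the start of the input, so there is a single token
            scanner.useDelimiter("\\A");
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(
                "<html>\n<body>Hello</body>\n</html>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(in));
    }
}
```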

2. Example: Read web page via Java

Create a Java project called de.vogella.web.html. The following code will read an HTML page from a URL and write the result to the console.

package de.vogella.web.html;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class ReadWebPage {
  public static void main(String[] args) {
    String urlText = "https://www.vogella.com/";
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(new URL(urlText).openStream()))) {
      String inputLine;
      while ((inputLine = in.readLine()) != null) {
        System.out.println(inputLine);
      }
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}

3. Getting the return code from a webpage

HTTP status codes (return codes) are standardized codes which a web server returns to report the outcome of a request. For example, the status code "200" means that the HTTP request was OK and that the server performs the required action, e.g., serving the web page.

The most important HTTP status codes are:

Table 1. HTTP status codes
Return code   Explanation
200           OK
301           Permanent redirect to another web page
400           Bad request
404           Not found

The following code accesses a web page and prints the status code of the HTTP response.

package de.vogella.web.html;

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ReadReturnCode {
  public static void main(String[] args) throws IOException {
    String urltext = "https://www.vogella.com/";
    URL url = new URL(urltext);
    int responseCode = ((HttpURLConnection) url.openConnection())
        .getResponseCode();
    System.out.println(responseCode);
  }
}
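
HttpURLConnection defines constants for the common status codes, so the numbers do not have to be hard-coded. The following sketch maps a code to a short description and falls back to the class of the code (2xx, 3xx, 4xx, 5xx) for everything else. The class name StatusCodeExample and the method describe are made up for this illustration.

```java
import java.net.HttpURLConnection;

public class StatusCodeExample {

    // Maps a status code to a short description
    public static String describe(int code) {
        switch (code) {
        case HttpURLConnection.HTTP_OK:          // 200
            return "OK";
        case HttpURLConnection.HTTP_MOVED_PERM:  // 301
            return "Permanent redirect";
        case HttpURLConnection.HTTP_BAD_REQUEST: // 400
            return "Bad request";
        case HttpURLConnection.HTTP_NOT_FOUND:   // 404
            return "Not found";
        default:
            // Fall back to the class of the status code
            if (code >= 200 && code < 300) return "Success";
            if (code >= 300 && code < 400) return "Redirect";
            if (code >= 400 && code < 500) return "Client error";
            if (code >= 500 && code < 600) return "Server error";
            return "Other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(200));
        System.out.println(describe(404));
        System.out.println(describe(503));
    }
}
```

In a real program the argument would be the value returned by getResponseCode().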

4. Content Type / MIME Type

The Internet media type (MIME type for short), which is also called the content type, defines the type of a web resource. The MIME type is a two-part identifier for file formats on the Internet. For an HTML page the content type is "text/html".

The following code reads the content type (MIME type) of a web resource.

package de.vogella.web.html;

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ReadMimeType {
  public static void main(String[] args) throws IOException {
    String urltext = "https://www.vogella.com/";
    URL url = new URL(urltext);
    String contentType = ((HttpURLConnection) url.openConnection())
        .getContentType();
    System.out.println(contentType);
  }
}
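
The value of the Content-Type header often carries parameters in addition to the two-part type, for example "text/html; charset=UTF-8". The following sketch splits such a value into the MIME type and the charset. The class name ContentTypeExample and both method names are made up for this illustration, and the parsing is deliberately simple and does not cover every legal header syntax.

```java
import java.util.Locale;

public class ContentTypeExample {

    // Returns the MIME type without parameters, e.g. "text/html"
    public static String mimeType(String contentType) {
        int semicolon = contentType.indexOf(';');
        String type = semicolon >= 0 ? contentType.substring(0, semicolon) : contentType;
        return type.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the value of the charset parameter, or null if there is none
    public static String charset(String contentType) {
        String[] parts = contentType.split(";");
        // parts[0] is the MIME type, the parameters start at index 1
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            if (part.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return part.substring("charset=".length()).replace("\"", "").trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String header = "text/html; charset=UTF-8";
        System.out.println(mimeType(header));
        System.out.println(charset(header));
    }
}
```

Note that getContentType() can return null if the content type is not known, so a real program should check for null before parsing.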

5. Using HTTP GET services

Several websites offer services via HTTP GET calls. For example, you can send a GET request to "http://tinyurl" or "http://tr.im" and receive a short version of the URL you pass as a parameter. The following example demonstrates how to call the GET service of "http://TinyUrl" or "http://tr.im" via Java. Create the Java project "de.vogella.web.get" and create the following classes, which call the GET service and return the result.

package de.vogella.web.get;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class TinyURL {
  private static final String tinyUrl = "http://tinyurl.com/api-create.php?url=";

  public String shorter(String url) throws IOException {
    String tinyUrlLookup = tinyUrl + url;
    // try-with-resources closes the reader and the underlying stream
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(new URL(tinyUrlLookup).openStream()))) {
      // The first line of the response is the short URL
      return reader.readLine();
    }
  }
}

package de.vogella.web.get;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class Trim {
  private static final String trimUrl = "http://api.tr.im/v1/trim_simple?url=";

  public String shorter(String url) throws IOException {
    String trimUrlLookup = trimUrl + url;
    // try-with-resources closes the reader and the underlying stream
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(new URL(trimUrlLookup).openStream()))) {
      // The first line of the response is the short URL
      return reader.readLine();
    }
  }
}

The following class tests both services.

package de.vogella.web.get;

import java.io.IOException;

public class Test {

  public static void main(String[] args) throws IOException {
    String s = "https://www.vogella.com/";
    TinyURL tiny = new TinyURL();
    System.out.println(tiny.shorter(s));
    Trim trim = new Trim();
    System.out.println(trim.shorter(s));
  }

}
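
A URL that is passed as a query parameter can itself contain characters such as "?", "&" or "=", which would be misread as part of the service URL. The following sketch shows how such a value could be encoded with URLEncoder before it is appended. The class name QueryUrlExample and the method buildLookupUrl are made up for this illustration, and the query string of the target URL is only an example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryUrlExample {

    // Appends the URL-encoded target to the service URL
    public static String buildLookupUrl(String serviceUrl, String target) {
        try {
            return serviceUrl + URLEncoder.encode(target, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is available on every Java platform, so this is not expected
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String lookup = buildLookupUrl("http://tinyurl.com/api-create.php?url=",
                "https://www.vogella.com/?a=1&b=2");
        System.out.println(lookup);
    }
}
```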

6. Proxy

You can define a proxy at startup via a start parameter.

java -Dhttp.proxyHost=proxy -Dhttp.proxyPort=8080 JavaProgram

In your code you can set a proxy via System.setProperty. For example, if your proxy is called proxy and runs on port 8080, the following code sets the proxy. The http.proxyHost and http.proxyPort properties are sufficient for HTTP URLs; for HTTPS URLs the corresponding properties are https.proxyHost and https.proxyPort.

System.setProperty("http.proxyHost", "proxy");
System.setProperty("http.proxyPort", "8080");
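
System properties apply to the whole JVM. As an alternative, a proxy can be chosen for a single connection with the java.net.Proxy class, which is passed to URL.openConnection(Proxy). The following is a minimal sketch that reuses the host name proxy and the port 8080 from the text. The class name ProxyExample and the method httpProxy are made up for this illustration.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

public class ProxyExample {

    // Describes an HTTP proxy; the host name is not resolved at this point
    public static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        Proxy proxy = httpProxy("proxy", 8080);
        System.out.println(proxy.type() + " " + proxy.address());
        // A connection through this proxy could then be opened with:
        // HttpURLConnection con = (HttpURLConnection)
        //     new URL("https://www.vogella.com/").openConnection(proxy);
    }
}
```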

7. Resources