A HTTP message consists of a header and an optional body. The message header of an HTTP request consists of a request line and a collection of header fields. The message header of an HTTP response consists of a status line and a collection of header fields. All HTTP messages must include the protocol version. Some HTTP messages can optionally enclose a content body.
HttpCore defines the HTTP message object model to follow this definition closely, and provides extensive support for serialization (formatting) and deserialization (parsing) of HTTP message elements.
HTTP request is a message sent from the client to the server. The first line of that message includes the method to apply to the resource, the identifier of the resource, and the protocol version in use.
HttpRequest request = new BasicHttpRequest("GET", "/", HttpVersion.HTTP_1_1); System.out.println(request.getRequestLine().getMethod()); System.out.println(request.getRequestLine().getUri()); System.out.println(request.getProtocolVersion()); System.out.println(request.getRequestLine().toString());
stdout >
GET / HTTP/1.1 GET / HTTP/1.1
HTTP response is a message sent by the server back to the client after having received and interpreted a request message. The first line of that message consists of the protocol version followed by a numeric status code and its associated textual phrase.
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, HttpStatus.SC_OK, "OK"); System.out.println(response.getProtocolVersion()); System.out.println(response.getStatusLine().getStatusCode()); System.out.println(response.getStatusLine().getReasonPhrase()); System.out.println(response.getStatusLine().toString());
stdout >
HTTP/1.1 200 OK HTTP/1.1 200 OK
An HTTP message can contain a number of headers describing properties of the message such as the content length, content type, and so on. HttpCore provides methods to retrieve, add, remove, and enumerate such headers.
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, HttpStatus.SC_OK, "OK"); response.addHeader("Set-Cookie", "c1=a; path=/; domain=localhost"); response.addHeader("Set-Cookie", "c2=b; path=\"/\", c3=c; domain=\"localhost\""); Header h1 = response.getFirstHeader("Set-Cookie"); System.out.println(h1); Header h2 = response.getLastHeader("Set-Cookie"); System.out.println(h2); Header[] hs = response.getHeaders("Set-Cookie"); System.out.println(hs.length);
stdout >
Set-Cookie: c1=a; path=/; domain=localhost Set-Cookie: c2=b; path="/", c3=c; domain="localhost" 2
There is an efficient way to obtain all headers of a given type using the
HeaderIterator
interface.
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, HttpStatus.SC_OK, "OK"); response.addHeader("Set-Cookie", "c1=a; path=/; domain=localhost"); response.addHeader("Set-Cookie", "c2=b; path=\"/\", c3=c; domain=\"localhost\""); HeaderIterator it = response.headerIterator("Set-Cookie"); while (it.hasNext()) { System.out.println(it.next()); }
stdout >
Set-Cookie: c1=a; path=/; domain=localhost Set-Cookie: c2=b; path="/", c3=c; domain="localhost"
It also provides convenience methods to parse HTTP messages into individual header elements.
HttpResponse response = new BasicHttpResponse(HttpVersion.HTTP_1_1, HttpStatus.SC_OK, "OK"); response.addHeader("Set-Cookie", "c1=a; path=/; domain=localhost"); response.addHeader("Set-Cookie", "c2=b; path=\"/\", c3=c; domain=\"localhost\""); HeaderElementIterator it = new BasicHeaderElementIterator( response.headerIterator("Set-Cookie")); while (it.hasNext()) { HeaderElement elem = it.nextElement(); System.out.println(elem.getName() + " = " + elem.getValue()); NameValuePair[] params = elem.getParameters(); for (int i = 0; i < params.length; i++) { System.out.println(" " + params[i]); } }
stdout >
c1 = a path=/ domain=localhost c2 = b path=/ c3 = c domain=localhost
HTTP headers are tokenized into individual header elements only on demand. HTTP headers received over an HTTP connection are stored internally as an array of characters and parsed lazily only when you access their properties.
HTTP messages can carry a content entity associated with the request or response. Entities can be found in some requests and in some responses, as they are optional. Requests that use entities are referred to as entity-enclosing requests. The HTTP specification defines two entity-enclosing methods: POST and PUT. Responses are usually expected to enclose a content entity. There are exceptions to this rule such as responses to HEAD method and 204 No Content, 304 Not Modified, 205 Reset Content responses.
HttpCore distinguishes three kinds of entities, depending on where their content originates:
streamed: The content is received from a stream, or generated on the fly. In particular, this category includes entities being received from a connection. Streamed entities are generally not repeatable.
self-contained: The content is in memory or obtained by means that are independent from a connection or other entity. Self-contained entities are generally repeatable.
wrapping: The content is obtained from another entity.
An entity can be repeatable, meaning its content can be read more than once. This
is only possible with self-contained entities (like
ByteArrayEntity
or StringEntity
).
Since an entity can represent both binary and character content, it has support for character encodings (to support the latter, i.e. character content).
The entity is created when executing a request with enclosed content or when the request was successful and the response body is used to send the result back to the client.
To read the content from the entity, one can either retrieve the input stream via
the HttpEntity#getContent()
method, which returns an
java.io.InputStream
, or one can supply an output stream to
the HttpEntity#writeTo(OutputStream)
method, which will
return once all content has been written to the given stream. Please note that
some non-streaming (self-contained) entities may be unable to represent their
content as a java.io.InputStream
efficiently. It is legal
for such entities to implement HttpEntity#writeTo(OutputStream)
method only and to throw UnsupportedOperationException
from HttpEntity#getContent()
method.
The EntityUtils
class exposes several static methods to simplify
extracting the content or information from an entity. Instead of reading
the java.io.InputStream
directly, one can retrieve the complete
content body in a string or byte array by using the methods from this class.
When the entity has been received with an incoming message, the methods
HttpEntity#getContentType()
and
HttpEntity#getContentLength()
methods can be used for
reading the common metadata such as Content-Type
and
Content-Length
headers (if they are available). Since the
Content-Type
header can contain a character encoding for text
mime-types like text/plain
or text/html
,
the HttpEntity#getContentEncoding()
method is used to
read this information. If the headers aren't available, a length of -1 will be
returned, and NULL
for the content type. If the
Content-Type
header is available, a Header object will be
returned.
When creating an entity for a outgoing message, this meta data has to be supplied by the creator of the entity.
StringEntity myEntity = new StringEntity("important message", Consts.UTF_8); System.out.println(myEntity.getContentType()); System.out.println(myEntity.getContentLength()); System.out.println(EntityUtils.toString(myEntity)); System.out.println(EntityUtils.toByteArray(myEntity).length);
stdout >
Content-Type: text/plain; charset=UTF-8 17 important message 17
In order to ensure proper release of system resources one must close the content stream associated with the entity.
HttpResponse response; HttpEntity entity = response.getEntity(); if (entity != null) { InputStream instream = entity.getContent(); try { // do something useful } finally { instream.close(); } }
When working with streaming entities, one can use the
EntityUtils#consume(HttpEntity)
method to ensure that
the entity content has been fully consumed and the underlying stream has been
closed.
There are a few ways to create entities. HttpCore provides the following implementations:
Exactly as the name implies, this basic entity represents an underlying stream. In general, use this class for entities received from HTTP messages.
This entity has an empty constructor. After construction, it represents no content, and has a negative content length.
One needs to set the content stream, and optionally the length. This can be done
with the BasicHttpEntity#setContent(InputStream)
and
BasicHttpEntity#setContentLength(long)
methods
respectively.
BasicHttpEntity myEntity = new BasicHttpEntity(); myEntity.setContent(someInputStream); myEntity.setContentLength(340); // sets the length to 340
ByteArrayEntity
is a self-contained, repeatable entity
that obtains its content from a given byte array. Supply the byte array to the
constructor.
ByteArrayEntity myEntity = new ByteArrayEntity(new byte[] {1,2,3}, ContentType.APPLICATION_OCTET_STREAM);
StringEntity
is a self-contained, repeatable entity that
obtains its content from a java.lang.String
object. It has
three constructors, one simply constructs with a given java.lang.String
object; the second also takes a character encoding for the data in the
string; the third allows the mime type to be specified.
StringBuilder sb = new StringBuilder(); Map<String, String> env = System.getenv(); for (Map.Entry<String, String> envEntry : env.entrySet()) { sb.append(envEntry.getKey()) .append(": ").append(envEntry.getValue()) .append("\r\n"); } // construct without a character encoding (defaults to ISO-8859-1) HttpEntity myEntity1 = new StringEntity(sb.toString()); // alternatively construct with an encoding (mime type defaults to "text/plain") HttpEntity myEntity2 = new StringEntity(sb.toString(), Consts.UTF_8); // alternatively construct with an encoding and a mime type HttpEntity myEntity3 = new StringEntity(sb.toString(), ContentType.create("text/plain", Consts.UTF_8));
InputStreamEntity
is a streamed, non-repeatable entity that
obtains its content from an input stream. Construct it by supplying the input
stream and the content length. Use the content length to limit the amount of data
read from the java.io.InputStream
. If the length matches
the content length available on the input stream, then all data will be sent.
Alternatively, a negative content length will read all data from the input stream,
which is the same as supplying the exact content length, so use the length to limit
the amount of data to read.
InputStream instream = getSomeInputStream(); InputStreamEntity myEntity = new InputStreamEntity(instream, 16);
FileEntity
is a self-contained, repeatable entity that
obtains its content from a file. Use this mostly to stream large files of different
types, where you need to supply the content type of the file, for
instance, sending a zip file would require the content type
application/zip
, for XML application/xml
.
HttpEntity entity = new FileEntity(staticFile, ContentType.create("application/java-archive"));
This is the base class for creating wrapped entities. The wrapping entity holds a reference to a wrapped entity and delegates all calls to it. Implementations of wrapping entities can derive from this class and need to override only those methods that should not be delegated to the wrapped entity.
BufferedHttpEntity
is a subclass of
HttpEntityWrapper
. Construct it by supplying another entity. It
reads the content from the supplied entity, and buffers it in memory.
This makes it possible to make a repeatable entity, from a non-repeatable entity. If the supplied entity is already repeatable, it simply passes calls through to the underlying entity.
myNonRepeatableEntity.setContent(someInputStream); BufferedHttpEntity myBufferedEntity = new BufferedHttpEntity( myNonRepeatableEntity);
HTTP protocol interceptor is a routine that implements a specific aspect of the HTTP protocol. Usually protocol interceptors are expected to act upon one specific header or a group of related headers of the incoming message or populate the outgoing message with one specific header or a group of related headers. Protocol interceptors can also manipulate content entities enclosed with messages; transparent content compression / decompression being a good example. Usually this is accomplished by using the 'Decorator' pattern where a wrapper entity class is used to decorate the original entity. Several protocol interceptors can be combined to form one logical unit.
HTTP protocol processor is a collection of protocol interceptors that implements the 'Chain of Responsibility' pattern, where each individual protocol interceptor is expected to work on the particular aspect of the HTTP protocol it is responsible for.
Usually the order in which interceptors are executed should not matter as long as they do not depend on a particular state of the execution context. If protocol interceptors have interdependencies and therefore must be executed in a particular order, they should be added to the protocol processor in the same sequence as their expected execution order.
Protocol interceptors must be implemented as thread-safe. Similarly to servlets, protocol interceptors should not use instance variables unless access to those variables is synchronized.
HttpCore comes with a number of most essential protocol interceptors for client and server HTTP processing.
RequestContent
is the most important interceptor for
outgoing requests. It is responsible for delimiting content length by adding
the Content-Length
or Transfer-Content
headers
based on the properties of the enclosed entity and the protocol version. This
interceptor is required for correct functioning of client side protocol processors.
ResponseContent
is the most important interceptor for
outgoing responses. It is responsible for delimiting content length by adding
Content-Length
or Transfer-Content
headers
based on the properties of the enclosed entity and the protocol version. This
interceptor is required for correct functioning of server side protocol processors.
RequestConnControl
is responsible for adding the
Connection
header to the outgoing requests, which is essential
for managing persistence of HTTP/1.0
connections. This
interceptor is recommended for client side protocol processors.
ResponseConnControl
is responsible for adding
the Connection
header to the outgoing responses, which is essential
for managing persistence of HTTP/1.0
connections. This
interceptor is recommended for server side protocol processors.
RequestDate
is responsible for adding the
Date
header to the outgoing requests. This interceptor is
optional for client side protocol processors.
ResponseDate
is responsible for adding the
Date
header to the outgoing responses. This interceptor is
recommended for server side protocol processors.
RequestExpectContinue
is responsible for enabling the
'expect-continue' handshake by adding the Expect
header. This
interceptor is recommended for client side protocol processors.
RequestTargetHost
is responsible for adding the
Host
header. This interceptor is required for client side
protocol processors.
RequestUserAgent
is responsible for adding the
User-Agent
header. This interceptor is recommended for client
side protocol processors.
Usually HTTP protocol processors are used to pre-process incoming messages prior to executing application specific processing logic and to post-process outgoing messages.
HttpProcessor httpproc = HttpProcessorBuilder.create() // Required protocol interceptors .add(new RequestContent()) .add(new RequestTargetHost()) // Recommended protocol interceptors .add(new RequestConnControl()) .add(new RequestUserAgent("MyAgent-HTTP/1.1")) // Optional protocol interceptors .add(new RequestExpectContinue(true)) .build(); HttpCoreContext context = HttpCoreContext.create(); HttpRequest request = new BasicHttpRequest("GET", "/"); httpproc.process(request, context);
Send the request to the target host and get a response.
HttpResponse = <...> httpproc.process(response, context);
Please note the BasicHttpProcessor
class does not synchronize
access to its internal structures and therefore may not be thread-safe.
Originally HTTP has been designed as a stateless, response-request oriented protocol.
However, real world applications often need to be able to persist state information
through several logically related request-response exchanges. In order to enable
applications to maintain a processing state HttpCpre allows HTTP messages to be
executed within a particular execution context, referred to as HTTP context. Multiple
logically related messages can participate in a logical session if the same context is
reused between consecutive requests. HTTP context functions similarly to
a java.util.Map<String, Object>
. It is
simply a collection of logically related named values.
Please nore HttpContext
can contain arbitrary objects
and therefore may be unsafe to share between multiple threads. Care must be taken
to ensure that HttpContext
instances can be
accessed by one thread at a time.
Protocol interceptors can collaborate by sharing information - such as a processing
state - through an HTTP execution context. HTTP context is a structure that can be
used to map an attribute name to an attribute value. Internally HTTP context
implementations are usually backed by a HashMap
. The primary
purpose of the HTTP context is to facilitate information sharing among various
logically related components. HTTP context can be used to store a processing state for
one message or several consecutive messages. Multiple logically related messages can
participate in a logical session if the same context is reused between consecutive
messages.
HttpProcessor httpproc = HttpProcessorBuilder.create() .add(new HttpRequestInterceptor() { public void process( HttpRequest request, HttpContext context) throws HttpException, IOException { String id = (String) context.getAttribute("session-id"); if (id != null) { request.addHeader("Session-ID", id); } } }) .build(); HttpCoreContext context = HttpCoreContext.create(); HttpRequest request = new BasicHttpRequest("GET", "/"); httpproc.process(request, context);