【Question title】: How do I get the body from this example HTTP request?
【Posted】: 2016-06-23 09:55:20
【Question】:

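I'm trying to find the simplest way to parse RFC-822 documents in Java. Suppose I have a message queue that stores HTTP messages, both requests and responses. So they are not retrieved the "normal" way, that is, by opening a socket connection to port 80 and sending/receiving messages there.

In the code below I deliberately mixed "mail" headers into an HTTP message. This is to show that the two are not very different, but it is beside the point. Here is the code: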

package httpexample;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import org.apache.http.Header;
import org.apache.http.HttpException;
import org.apache.http.HttpRequest;
import org.apache.http.impl.io.DefaultHttpRequestParser;
import org.apache.http.impl.io.HttpTransportMetricsImpl;
import org.apache.http.impl.io.SessionInputBufferImpl;
import org.apache.http.io.HttpMessageParser;
import org.apache.http.message.BasicHttpEntityEnclosingRequest;

public class HttpExample {

    // RFC 822

    public static void main(String[] args) throws IOException, HttpException {
        String str = "POST http://localhost:8080/foobar/1234567 HTTP/1.1\n" +
            "Message-ID: <19815303.1075861029555.JavaMail.ss@kk>\n" +
            "Date: Wed, 6 Mar 2010 12:32:20 -0800 (PST)\n" +
            "From: someone@someotherplace.com\n" +
            "To: someone@someplace.com\n" +
            "Subject: some subject\n" +
            "Mime-Version: 1.0\n" +
            "Content-Type: text/plain; charset=us-ascii\n" +
            "Content-Transfer-Encoding: 7bit\n" +
            "X-From: one, some <some.one@someotherplace.com>\n" +
            "X-To: one\n" +
            "X-cc: \n" +
            "X-bcc: \n" +
            "X-Origin: Bob-R\n" +
            "X-FileName: rbob (Non-Privileged).pst\n" +
            "\n" +
            "some message\n";
        ByteArrayInputStream fakeStream = new ByteArrayInputStream(
                str.getBytes());
        HttpTransportMetricsImpl metrics = new HttpTransportMetricsImpl();
        SessionInputBufferImpl inbuffer = new SessionInputBufferImpl(metrics, 1024);

        inbuffer.bind(fakeStream);
        HttpMessageParser<HttpRequest> requestParser =
                new DefaultHttpRequestParser(inbuffer);
        BasicHttpEntityEnclosingRequest request = (BasicHttpEntityEnclosingRequest)requestParser.parse();

        for (Header hdr : request.getAllHeaders()) {
            System.out.println(String.format("%-30s = %s", hdr.getName(), hdr.getValue()));
        }
        System.out.println(String.format("Request Line: %s", request.getRequestLine()));
        System.out.println(String.format("Body\n------------------\n%s",
                request.getEntity()));
    }

}

The output is:

Message-ID                     = <19815303.1075861029555.JavaMail.ss@kk>
Date                           = Wed, 6 Mar 2010 12:32:20 -0800 (PST)
From                           = someone@someotherplace.com
To                             = someone@someplace.com
Subject                        = some subject
Mime-Version                   = 1.0
Content-Type                   = text/plain; charset=us-ascii
Content-Transfer-Encoding      = 7bit
X-From                         = one, some <some.one@someotherplace.com>
X-To                           = one
X-cc                           = 
X-bcc                          = 
X-Origin                       = Bob-R
X-FileName                     = rbob (Non-Privileged).pst
Request Line: POST http://localhost:8080/foobar/1234567 HTTP/1.1
Body
------------------
null

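What I can't figure out is how to access the body of the message.

I expect it to have the content some message\n

I can't find any method in BasicHttpEntityEnclosingRequest that gives me this value. In an earlier version I used

HttpRequest request = requestParser.parse();

instead of

BasicHttpEntityEnclosingRequest request =
    (BasicHttpEntityEnclosingRequest) requestParser.parse();

I changed it to BasicHttpEntityEnclosingRequest because it has a getEntity method. But that returns null.

So I'm a bit lost.

Where can I find the body?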

【Comments】:

    Tags: java http apache-httpcomponents


    【Answer 1】:

    I added a Content-Length header, because otherwise the parser ignores the POST body. I modified your code, and it now parses the body fine:

    package org.apache.http.examples;
    
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.Socket;
    
    import org.apache.http.Header;
    import org.apache.http.HttpException;
    import org.apache.http.message.BasicHttpEntityEnclosingRequest;
    import org.apache.http.util.EntityUtils;
    
    public class HttpExample {
    
        // RFC 822
    
        public static void main(String[] args) throws IOException, HttpException {
            String str = "POST http://localhost:8080/foobar/1234567 HTTP/1.1\n" +
                "Message-ID: <19815303.1075861029555.JavaMail.ss@kk>\n" +
                "Date: Wed, 6 Mar 2010 12:32:20 -0800 (PST)\n" +
                "From: someone@someotherplace.com\n" +
                "To: someone@someplace.com\n" +
                "Subject: some subject\n" +
                "Mime-Version: 1.0\n" +
                "Content-Type: text/plain; charset=us-ascii\n" +
                "Content-Transfer-Encoding: 7bit\n" +
                "X-From: one, some <some.one@someotherplace.com>\n" +
                "X-To: one\n" +
                "X-cc: \n" +
                "X-bcc: \n" +
                "X-Origin: Bob-R\n" +
                "X-FileName: rbob (Non-Privileged).pst\n" +
                "Content-Length: 13\n" +
                "\n" +
                "some message\n";
            ByteArrayInputStream fakeStream = new ByteArrayInputStream(
                    str.getBytes());
    
            BHttpConnectionBaseImpl b = new BHttpConnectionBaseImpl(fakeStream);
    
            BasicHttpEntityEnclosingRequest request1 = (BasicHttpEntityEnclosingRequest) b.receiveRequestHeader();
            b.receiveRequestEntity(request1);
    
    
            for (Header hdr : request1.getAllHeaders()) {
                System.out.println(String.format("%-30s = %s", hdr.getName(), hdr.getValue()));
            }
            System.out.println(String.format("Request Line: %s", request1.getRequestLine()));
            System.out.println(String.format("Body\n------------------\n%s",
                    EntityUtils.toString( request1.getEntity() ) ));
        }
    
    }
    
    class BHttpConnectionBaseImpl extends  org.apache.http.impl.DefaultBHttpServerConnection{
    
        private InputStream inputStream;
    
        public BHttpConnectionBaseImpl(final InputStream inputStream) {
            super(4048);
            this.inputStream = inputStream;
            try {
                super.bind(new Socket());
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    
        @Override
        protected InputStream getSocketInputStream(final Socket socket) throws IOException {
            return inputStream;
        }
    
        @Override
        protected OutputStream getSocketOutputStream(final Socket socket) throws IOException {
            return new ByteArrayOutputStream();
        }
    }
    

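    Parsing of the POST body happens in org.apache.http.impl.BHttpConnectionBase.prepareInput(HttpMessage); however, its only constructor is protected and takes a lot of parameters. The subclass org.apache.http.impl.DefaultBHttpServerConnection has a convenient public constructor and does the header parsing in receiveRequestHeader(). The methods I overrode are needed to bypass some error checks, for example the one for Socket == null, and to be able to read the request from fakeStream.

    Another approach that might work is to override Socket, in particular its getInputStream() and getOutputStream(), although I haven't tested it. Then create an instance of DefaultBHttpServerConnection and call its bind method. The rest should be the same.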
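The Socket-based alternative could be sketched roughly as follows. This is a minimal, JDK-only illustration: `InMemorySocket` and `written()` are hypothetical names, and the hand-off to `DefaultBHttpServerConnection.bind` appears only in comments because it is as untested here as in the answer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** A never-connected Socket whose streams are backed by in-memory buffers (hypothetical helper). */
class InMemorySocket extends Socket {
    private final InputStream in;
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    InMemorySocket(byte[] requestBytes) {
        super(); // unconnected socket; no network access happens
        this.in = new ByteArrayInputStream(requestBytes);
    }

    @Override
    public InputStream getInputStream() {
        return in; // the stored message instead of network data
    }

    @Override
    public OutputStream getOutputStream() {
        return out; // anything written is only captured in memory
    }

    /** Bytes written to the fake output side so far. */
    byte[] written() {
        return out.toByteArray();
    }
}

class InMemorySocketDemo {
    public static void main(String[] args) throws IOException {
        byte[] raw = ("POST /foobar/1234567 HTTP/1.1\r\n"
                + "Content-Length: 13\r\n"
                + "\r\n"
                + "some message\n").getBytes(StandardCharsets.US_ASCII);
        Socket socket = new InMemorySocket(raw);

        // Assumed (untested) follow-up with HttpCore on the classpath:
        //   DefaultBHttpServerConnection conn = new DefaultBHttpServerConnection(4096);
        //   conn.bind(socket);
        //   ... then receiveRequestHeader() / receiveRequestEntity(...) as in the answer's code.

        // JDK-only check that the socket really serves the stored bytes:
        byte[] seen = socket.getInputStream().readAllBytes();
        System.out.println(seen.length + " bytes readable from the fake socket");
    }
}
```

Compared with subclassing the connection, this keeps the fake I/O in one small class, but it relies on the connection only ever asking the socket for its streams, which is an assumption.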

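    【Comments】:

    • I think this is the right answer because it is reduced to the essentials: Content-Length and DefaultBHttpServerConnection. Neither Transfer-Encoding nor .getBytes() was wrong. You might want to add a short note on DefaultBHttpServerConnection and why it is needed?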
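    【Answer 2】:

    I think the problem may be that it is not clear from your message headers how long the body is, so the receiver simply ignores it. The HTTP specification defines several options for conveying this information, but none of them seems to apply here:

    1. Content-Transfer-Encoding would have to be Transfer-Encoding.
    2. 7bit is not among the standard options.
    3. When you use str.getBytes(), it gives you UTF-16 bytes, not the us-ascii declared in Content-Type.

    So I would change your request a little:

    1. Use the header Content-Type: text/plain; charset=UTF-16
    2. Remove the header Content-Transfer-Encoding
    3. Add Content-Length: 28 (28 being "some message\n".getBytes("UTF-16").length).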
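The value to put in Content-Length depends on the charset used to encode the body, so it is worth checking rather than guessing. A small JDK-only sketch (`BodyLength` and `encodedLength` are hypothetical names) that counts the bytes for a few charsets; note that the no-argument `getBytes()` uses the platform default charset, which varies between systems and JDK versions:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BodyLength {
    /** Number of bytes the body occupies when encoded with the given charset. */
    static int encodedLength(String body, Charset charset) {
        return body.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String body = "some message\n";
        System.out.println("US-ASCII: " + encodedLength(body, StandardCharsets.US_ASCII));
        System.out.println("UTF-8:    " + encodedLength(body, StandardCharsets.UTF_8));
        // Java's "UTF-16" encoder prepends a 2-byte byte-order mark.
        System.out.println("UTF-16:   " + encodedLength(body, StandardCharsets.UTF_16));
        // The no-argument overload depends on the platform default charset.
        System.out.println("default (" + Charset.defaultCharset() + "): "
                + body.getBytes().length);
    }
}
```

For this body the explicit charsets give 13 bytes for US-ASCII and UTF-8 and 28 bytes for UTF-16 (2 for the byte-order mark plus 2 per character), so the declared charset and the declared length have to be chosen together.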

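    【Comments】:

      【Answer 3】:

      Looking at the source code of DefaultHttpRequestParser, it seems to parse only the request line and the headers, and does not attempt to parse the body.

      This thread discusses the same topic. There are a few proposed solutions as well.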
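To illustrate the split between "head" and "body", here is a hand-rolled, JDK-only sketch that reads the request line and headers and then uses Content-Length to read the body from what is left in the stream. It is not how HttpCore is implemented; `RawHttpBody`, `readLine` and `readBody` are hypothetical names, and the sketch ignores chunked encoding and header folding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

class RawHttpBody {
    /** Reads one LF-terminated line as US-ASCII; CR characters are dropped. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                line.write(b);
            }
        }
        return new String(line.toByteArray(), StandardCharsets.US_ASCII);
    }

    /** Skips the request line, collects the headers, then reads Content-Length bytes as the body. */
    static String readBody(InputStream in) throws IOException {
        readLine(in); // request line, not needed in this sketch
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        String line;
        while (!(line = readLine(in)).isEmpty()) { // an empty line ends the header section
            int colon = line.indexOf(':');
            if (colon > 0) {
                headers.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
            }
        }
        String declared = headers.get("Content-Length");
        if (declared == null) {
            return null; // no declared length: this sketch does not guess where the body ends
        }
        int length = Integer.parseInt(declared);
        byte[] body = new byte[length];
        int read = in.readNBytes(body, 0, length); // may be shorter if the stream ends early
        return new String(body, 0, read, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        String message = "POST http://localhost:8080/foobar/1234567 HTTP/1.1\n"
                + "Content-Type: text/plain; charset=us-ascii\n"
                + "Content-Length: 13\n"
                + "\n"
                + "some message\n";
        InputStream in = new ByteArrayInputStream(message.getBytes(StandardCharsets.US_ASCII));
        System.out.print("Body: " + readBody(in));
    }
}
```

Without a Content-Length header the sketch returns null, which mirrors the symptom in the question: the head is parsed, but nothing tells the reader how much body to consume.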

      【Comments】:

        【Answer 4】:

        Customize header parsing by overriding LineParser:

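        // LineParser here is presumably the answerer's own class (for example one extending
        // BasicLineParser); org.apache.http.message.LineParser itself is an interface.
        inbuffer = new SessionInputBufferImpl(new HttpTransportMetricsImpl(), reqDataLength);
        inbuffer.bind(input);
        HttpMessageParser<org.apache.http.HttpRequest> requestParser = new DefaultHttpRequestParser(
                        inbuffer,
                        new LineParser(),
                        new DefaultHttpRequestFactory(),
                        MessageConstraints.DEFAULT
                );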
        

        Get the entity body as follows:

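                // req is assumed to be the request returned by requestParser.parse(),
                // and buf the SessionInputBuffer bound above (inbuffer).
                HttpEntityEnclosingRequest ereq = (HttpEntityEnclosingRequest) req;
                ContentLengthStrategy contentLengthStrategy =
                        StrictContentLengthStrategy.INSTANCE;
                // Work out from the headers how the body is delimited.
                long len = contentLengthStrategy.determineLength(req);
                InputStream contentStream = null;
                if (len == ContentLengthStrategy.CHUNKED) {
                    // Transfer-Encoding: chunked
                    contentStream = new ChunkedInputStream(buf);
                } else if (len == ContentLengthStrategy.IDENTITY) {
                    // no declared length: read until the end of the stream
                    contentStream = new IdentityInputStream(buf);
                } else {
                    // Content-Length given: read exactly len bytes
                    contentStream = new ContentLengthInputStream(buf, len);
                }
                // Attach the body stream to the request as its entity.
                BasicHttpEntity ent = new BasicHttpEntity();
                ent.setContent(contentStream);
                ereq.setEntity(ent);
                return ereq;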
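
The three branches in the snippet above correspond to three ways a body can be delimited. The decision itself can be sketched without HttpCore. This is a simplified, JDK-only illustration with hypothetical names (`Framing`, `Kind`, `choose`, `headers`), not the library's actual strategy implementation:

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class Framing {
    /** How the end of the message body is determined. */
    enum Kind { CHUNKED, FIXED_LENGTH, IDENTITY }

    /** Picks a framing from the headers; the map is expected to be case-insensitive. */
    static Kind choose(Map<String, String> headers) {
        String transferEncoding = headers.get("Transfer-Encoding");
        if (transferEncoding != null
                && transferEncoding.toLowerCase(Locale.ROOT).contains("chunked")) {
            return Kind.CHUNKED; // the body arrives as size-prefixed chunks
        }
        if (headers.get("Content-Length") != null) {
            return Kind.FIXED_LENGTH; // the body is exactly Content-Length bytes
        }
        return Kind.IDENTITY; // no declared length: read until the stream ends
    }

    /** Builds a case-insensitive header map from name/value pairs. */
    static Map<String, String> headers(String... nameValuePairs) {
        Map<String, String> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            map.put(nameValuePairs[i], nameValuePairs[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(choose(headers("Content-Length", "13")));
        System.out.println(choose(headers("Transfer-Encoding", "chunked")));
        System.out.println(choose(headers("Content-Type", "text/plain")));
    }
}
```

Checking Transfer-Encoding before Content-Length is a deliberate choice in this sketch: when both are present, chunked framing is what actually describes the bytes on the wire.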
        

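        【Comments】:

        • It is usually best to explain why this answers the question.
        • The approach above makes it easy to customize how HTTP headers are parsed, and lets us get the body data through HttpEntityEnclosingRequest.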