1. I made a simple page, index.html, and put it under wwwroot:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Bootstrap b6test</title>
</head>
<body>
ok
</body>
</html>
2. Crawl test results from Baidu Webmaster Tools
(1) PC crawl result
ok
Returned HTTP headers:
HTTP/1.1 200 OK
Server: nginx/1.11.1
Date: Fri, 13 Jan 2017 09:03:41 GMT
Content-Type: text/html
Content-Length: 627
Connection: close
Last-Modified: Fri, 13 Jan 2017 09:01:47 GMT
Accept-Ranges: bytes
ETag: "1d26d7bab544df3"
X-Powered-By: ASP.NET
Crawled page content (only the first 200 KB is shown):
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Bootstrap b6test</title>
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-alpha.6/css/bootstrap.min.css" integrity="sha384-rwoIResjU2yc3z8GV/NPeZWAv56rSmLldC3R/AZzGRnGxQQKnKkoFVhFQhNUwEyJ" crossorigin="anonymous">
<!-- Latest compiled and minified JavaScript -->
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0-alpha.6/js/bootstrap.min.js" integrity="sha384-vBWWzlZJ8ea9aCX4pEW3rVHjgjt7zpkNpZk+02D9phzyeVkE+jo0ieGizqPLForn" crossorigin="anonymous"></script>
</head>
<body>
ok
</body>
</html>
(2) Mobile crawl result
Crawl error: invalid access parameters
Returned HTTP headers:
HTTP/1.1 400 Bad Request
Server: nginx/1.11.1
Date: Fri, 13 Jan 2017 10:00:37 GMT
Content-Length: 0
Connection: close
X-Powered-By: ASP.NET
3. Mobile crawl test against an ASP.NET MVC 5.2 site
OK
So the test page itself is ruled out; the problem can only be in the ASP.NET Core site.
Does anyone know what causes this?
400 Bad Request
Server: nginx/1.11.1
Check the logs: record the complete request message and look at it, and the cause should become clear.
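One way to record the complete request, sketched in Python rather than taken from this thread: a throwaway TCP listener that accepts a single connection and returns the raw bytes exactly as they arrived, before any web server parses or rejects them. The names `open_listener` and `capture_raw_request` are made up for illustration, and this assumes you can temporarily point the crawl test at a publicly reachable port running this script.

```python
import socket


def open_listener(host="127.0.0.1", port=0):
    """Open a listening TCP socket; port=0 lets the OS pick a free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    return srv


def capture_raw_request(srv, timeout=5.0):
    """Accept one connection and return the raw request bytes (request line + headers).

    Reads until the blank line that ends the HTTP header block, or until the
    client closes the connection. Any request body is not read here.
    """
    srv.settimeout(timeout)
    conn, _addr = srv.accept()
    with conn:
        conn.settimeout(timeout)
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
        # Reply with a minimal valid response so the client does not hang.
        conn.sendall(
            b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok"
        )
    return data
```

To use it, call `open_listener("0.0.0.0", 8080)` (the port is only an example), pass the result to `capture_raw_request` with a generous timeout, and print the returned bytes with `repr()` so that non-printable characters stay visible.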
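I suggest looking at the corresponding error in the ASP.NET Core logs.
There is a setting that controls whether crawlers are allowed.
Could you explain in more detail?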
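Has this been solved? I ran into the same situation. It seems the UA contains special characters. In my own tests, putting special characters in cookies also produced a 400 right away. How did the original poster solve it? Could you share?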
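To check the "special characters" guess against a captured request, here is a rough Python sketch. It flags header values containing bytes outside printable ASCII (tab is allowed). Which characters the server actually rejects has not been established in this thread, so treat the output as a list of candidates only. `find_suspect_bytes` and `scan_headers` are hypothetical helper names, and the input is assumed to be the raw request bytes, however you obtained them.

```python
def find_suspect_bytes(raw_value: bytes):
    """Return (offset, byte) pairs for bytes outside printable ASCII.

    Printable ASCII is 0x20-0x7E; horizontal tab (0x09) is also allowed.
    """
    return [
        (i, b)
        for i, b in enumerate(raw_value)
        if not (0x20 <= b <= 0x7E or b == 0x09)
    ]


def scan_headers(raw_request: bytes):
    """Map each header name to its suspect bytes; clean headers are omitted."""
    head = raw_request.split(b"\r\n\r\n", 1)[0]
    lines = head.split(b"\r\n")[1:]  # skip the request line
    report = {}
    for line in lines:
        name, sep, value = line.partition(b":")
        if not sep:
            continue  # not a "Name: value" line
        suspects = find_suspect_bytes(value.strip())
        if suspects:
            report[name.decode("ascii", "replace")] = suspects
    return report
```

An empty result means every header value is plain printable ASCII, in which case the cause of the 400 is probably somewhere else.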