This takes regular expressions plus a bit of networking with HttpWebRequest. Take a look at the example below; it is quite simple.
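// Requires: using System; using System.IO; using System.Net;
//           using System.Text; using System.Text.RegularExpressions;
// The original post showed only the method body; the signature here is assumed.
public static string[] GetWeather()
{
    string[] weather = new string[5];
    string content = "";
    WebRequest theRequest = WebRequest.Create("http://weather.news.qq.com/inc/ss82.htm");
    try
    {
        // Dispose the response, stream, and reader once the page has been read.
        using (HttpWebResponse theResponse = (HttpWebResponse)theRequest.GetResponse())
        using (Stream sm = theResponse.GetResponseStream())
        // Encoding.Default is the machine's default encoding; switch to the
        // page's actual charset if the text comes back garbled.
        using (StreamReader read = new StreamReader(sm, Encoding.Default))
        {
            content = read.ReadToEnd();
        }
    }
    catch (Exception)
    {
        // On any failure, fall back to an empty page so nothing matches below.
        content = "";
    }

    // weather[0]: the cell drawn with the r_tembg5.gif background.
    // If several cells match, the last one wins.
    string patternTmp = "<td height=\"23\" width=\"117\" background=\"/images/r_tembg5.gif\" align=\"center\">(?<item1>[^<]+)</td>";
    Regex regex = new Regex(patternTmp, RegexOptions.Compiled | RegexOptions.IgnoreCase);
    for (Match mcTmp = regex.Match(content); mcTmp.Success; mcTmp = mcTmp.NextMatch())
    {
        weather[0] = mcTmp.Groups["item1"].Value;
    }

    // weather[1..4]: the remaining centered cells, in page order.
    patternTmp = "height=\"23\" align=\"center\">(?<item1>[^/]+)</td>";
    regex = new Regex(patternTmp, RegexOptions.Compiled | RegexOptions.IgnoreCase);
    int k = 1;
    // Stop at the end of the array so extra matches cannot throw IndexOutOfRangeException.
    for (Match mcTmp = regex.Match(content); mcTmp.Success && k < weather.Length; mcTmp = mcTmp.NextMatch(), k++)
    {
        weather[k] = mcTmp.Groups["item1"].Value;
    }
    return weather;
}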
This is a method I wrote a while back for scraping a weather forecast. You can use it as a reference.
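If you want to try the named-group matching without hitting the network, something like the sketch below should work. The class name, the method name, and the sample markup are all made up for illustration (the real page's HTML may differ), and it collects matches into a List instead of a fixed-size array so that extra cells cannot overflow it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WeatherParseSketch
{
    // Returns the text of every align="center" table cell, in document order.
    public static List<string> ExtractCells(string html)
    {
        var cells = new List<string>();
        // (?<item1>...) is a named group; [^<]+ stops at the next tag.
        var regex = new Regex("align=\"center\">(?<item1>[^<]+)</td>", RegexOptions.IgnoreCase);
        for (Match m = regex.Match(html); m.Success; m = m.NextMatch())
        {
            cells.Add(m.Groups["item1"].Value.Trim());
        }
        return cells;
    }

    public static void Main()
    {
        // Hypothetical sample markup, not copied from the real page.
        string sample =
            "<tr><td height=\"23\" align=\"center\">Beijing</td>" +
            "<td height=\"23\" align=\"center\">Sunny</td></tr>";
        foreach (string cell in ExtractCells(sample))
        {
            Console.WriteLine(cell); // prints "Beijing", then "Sunny"
        }
    }
}
```

Keeping the parsing in its own method also makes it easy to test against saved copies of the page.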
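Create an HttpWebRequest object, POST the username and password to the login page, and capture the login cookie. Then use another HttpWebRequest, carrying that cookie, to request the page you want to scrape.
Simulating the login is all you need.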
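A rough sketch of that login-then-fetch flow, sharing one CookieContainer between the two requests. The class and method names, the form field names, and the UTF-8 encoding are assumptions for illustration; check the real login form for the actual field names and URL.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class LoginFetchSketch
{
    // Builds a URL-encoded form body such as "username=a&password=b".
    public static string BuildForm(string userField, string user, string passField, string pass)
    {
        return Uri.EscapeDataString(userField) + "=" + Uri.EscapeDataString(user)
             + "&" + Uri.EscapeDataString(passField) + "=" + Uri.EscapeDataString(pass);
    }

    // POSTs formBody to loginUrl, then GETs targetUrl with the cookies the login returned.
    public static string FetchAfterLogin(string loginUrl, string formBody, string targetUrl)
    {
        // One container shared by both requests carries the login cookie across.
        var cookies = new CookieContainer();

        var login = (HttpWebRequest)WebRequest.Create(loginUrl);
        login.Method = "POST";
        login.ContentType = "application/x-www-form-urlencoded";
        login.CookieContainer = cookies;
        byte[] body = Encoding.UTF8.GetBytes(formBody);
        login.ContentLength = body.Length;
        using (Stream s = login.GetRequestStream())
        {
            s.Write(body, 0, body.Length);
        }
        // Getting the response is what stores the server's cookies in the container.
        using (var loginResponse = (HttpWebResponse)login.GetResponse())
        {
        }

        var page = (HttpWebRequest)WebRequest.Create(targetUrl);
        page.CookieContainer = cookies;
        using (var pageResponse = (HttpWebResponse)page.GetResponse())
        using (var reader = new StreamReader(pageResponse.GetResponseStream(), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

You would call it roughly as FetchAfterLogin(loginUrl, BuildForm("username", name, "password", pwd), pageUrl), with both URLs being whatever your target site uses. If the site also checks a hidden token or a captcha, this alone will not be enough.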
Late to the thread.
There are plenty of ways to do this: cookies, Session, or Application can all get it done.