需要在WebFrom中实现一个网页抓取功能,里面用到了System.Windows.Forms.WebBrowser控件,功能已经实现,用VS IISExpress运行没问题,可以取到数据。但一部署到IIS上就出问题了,请各位大神帮忙指导一下,谢谢!
核心代码
static System.Windows.Forms.WebBrowser wb; private static string url ="http://www.baidu.com"; private static List<MyCount> myCounts = new List<MyCount>(); [STAThread] protected void Page_Load(object sender, EventArgs e) { try { System.Threading.Thread t = new System.Threading.Thread(new ThreadStart(() => { wb = new System.Windows.Forms.WebBrowser(); wb.DocumentCompleted += wb_DocumentCompleted; wb.Navigate(url); while (wb.ReadyState != System.Windows.Forms.WebBrowserReadyState.Complete) { System.Windows.Forms.Application.DoEvents(); //避免假死,若去掉则可能无法触发 DocumentCompleted 事件。 } }) ); t.SetApartmentState(ApartmentState.STA); t.Start(); } catch (Exception exception) { Response.Write(exception.Message); } } void wb_DocumentCompleted(object sender, System.Windows.Forms.WebBrowserDocumentCompletedEventArgs e) { myCounts = new List<MyCount>(); if (wb.ReadyState == WebBrowserReadyState.Complete && wb.IsBusy == false) { HtmlDocument doc = wb.Document; //抓取网页 HtmlElement hem = doc.GetElementById("list");//这里就像js里面一样通过ID来查找对象 for (int i = 0; i < hem.Children.Count; i++) { string innertext = hem.Children[i].InnerText.Trim(); string[] temps = innertext.Split(' '); myCounts.Add(new MyCount() { Name = temps[0], Number = temps[1] }); } } else { Response.Write(wb.ReadyState.ToString()); } }
错误:
跨线程了?
但是在本地是好的
这种用法不是推荐用法吧,实在需要如此用需要把IIS运行池的服务账号换成Administrator试试,IE的控件需要很高的权限才行
HtmlAgilityPack这个开源库是否可以满足你的场景
换成Administrator也不行,HtmlAgilityPack满足不了
@落日赌城: http://stackoverflow.com/questions/2071930/c-sharp-asp-net-use-windows-forms-webbrowser 最好还是换个形式,必须这样的话,从STA MTA上找原因,因为Winform是STA的运行模式