C#: Using regular expressions to extract data

Bounty: 10 beans [Resolved] Resolved on 2010-09-10 11:24


For example: aa bb cc

Raw data:

From this data I want to extract:

1. http://news.naver.com/main/read.nhnmode=LSD&mid=shm&sid1=102&oid=003&aid=0003428615

2. bb

3. cc

I hope someone can help me out!

.NET Technology, C#

nothing better | Rookie Level 2 | Beans: 235
Asked: 2010-09-09 18:08


Best answer

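Regex re = new Regex(@"<li\s*?>.*href=['""](.*)['""].*<strong.*?>(.*)</strong>\s*?</a>.*<span.*?>(.*)</span>.*?</li>", RegexOptions.None);
// html is the string that holds the source markup; the instance method Regex.Replace takes the input first, then the replacement pattern.
string result = re.Replace(html, "$1 $2 $3");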

Beans awarded: 2

青争竹马 | Hero Level 5 | Beans: 5874 | 2010-09-10 09:27

Thank you!

nothing better | Beans: 235 (Rookie Level 2) | 2010-09-10 11:23

Thank you very much for your help!

nothing better | Beans: 235 (Rookie Level 2) | 2010-09-10 11:31

Could you please explain what string result = re.Replace("$1 $2 $3"); does? Here is my code:

Regex re = new Regex(@"<li\s*?>.*href=['""](.*)['""].*<strong.*?>(.*)\s*?</a>.*<span.*?>(.*).*?</li>", RegexOptions.None);
MatchCollection mc2 = re.Matches(textBox1.Text);
if (mc2.Count > 0)
{
    foreach (Match m in mc2)
    {
        listBox1.Items.Add(m.Groups[1].Value);
    }
}

nothing better | Beans: 235 (Rookie Level 2) | 2010-09-10 11:42

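$1, $2 and $3 stand for the text captured by the first, second and third pair of parentheses "( )" in the pattern; each one is substituted at its position in the replacement string. In regular expressions this is called a "backreference" (the .NET documentation calls the $n form inside a replacement string a "substitution"). http://www.yesky.com/imagesnew/software/vbscript/html/reconBackreferences.htm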
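To make the explanation above concrete, here is a minimal sketch, not the answerer's exact code. The class name `SubstitutionDemo`, the helper `Flatten`, and the example.com URL are all hypothetical, and the pattern is a lazy-quantifier variant written for this illustration. It shows `$1 $2 $3` being expanded by `Regex.Replace`, and the same captures being read one by one through `Match.Groups`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SubstitutionDemo
{
    // Hypothetical sample input, modeled on the <li> markup discussed in this thread.
    public const string Html =
        "<li><a href=\"http://example.com/read?id=1\"><strong>bb</strong></a>" +
        "<span class=\"writing\">cc</span></li>";

    // Three capture groups: (1) href value, (2) <strong> text, (3) <span> text.
    public static readonly Regex Item = new Regex(
        @"<li\s*>.*?href=['""](.*?)['""].*?<strong[^>]*>(.*?)</strong>.*?<span[^>]*>(.*?)</span>.*?</li>");

    // Replace rewrites every match; $1, $2 and $3 insert the captured groups.
    public static string Flatten(string html)
    {
        return Item.Replace(html, "$1 $2 $3");
    }

    public static void Main()
    {
        // The whole <li>...</li> is replaced by "url title note".
        Console.WriteLine(Flatten(Html));

        // The same captures are available individually through Match.Groups.
        Match m = Item.Match(Html);
        if (m.Success)
        {
            Console.WriteLine(m.Groups[1].Value); // href value
            Console.WriteLine(m.Groups[2].Value); // <strong> text
            Console.WriteLine(m.Groups[3].Value); // <span> text
        }
    }
}
```

So `Replace` is useful when you want one flattened string, while `Matches` plus `Groups` (as in the code in the question above) is the better fit when each value goes to a separate place such as a list box.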

青争竹马 | Beans: 5874 (Hero Level 5) | 2010-09-10 20:18

Thank you for your help! I really appreciate it, and I learned something from it!

nothing better | Beans: 235 (Rookie Level 2) | 2010-09-13 07:13

Other answers (3)

These HTML tags are fixed, right?

Astar | Beans: 40805 (Master Level 7) | 2010-09-09 20:55

I tried it... your method doesn't work! But thank you very much for your help!

nothing better | Beans: 235 (Rookie Level 2) | 2010-09-10 07:17

These are not fixed tags!

nothing better | Beans: 235 (Rookie Level 2) | 2010-09-10 08:17

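@nothing better: Perhaps I misunderstood. The idea is to take every LI in the HTML that matches the pattern, and then extract the href and the aa, bb values from each one.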
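A rough sketch of that idea, under stated assumptions: the class name `ListItemDemo`, the method `Extract`, the group names `url`, `title` and `note`, and the two-item sample list are all invented for illustration. It walks every `<li>` that fits the pattern and collects the three values per item.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ListItemDemo
{
    // Lazy quantifiers stop each match at the first closing tag it can use;
    // Singleline lets '.' cross line breaks between tags.
    private static readonly Regex Item = new Regex(
        @"<li[^>]*>.*?href=['""](?<url>[^'""]*)['""].*?<strong[^>]*>(?<title>.*?)</strong>.*?<span[^>]*>(?<note>.*?)</span>.*?</li>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns one { url, title, note } array per matching <li>.
    public static List<string[]> Extract(string html)
    {
        var rows = new List<string[]>();
        foreach (Match m in Item.Matches(html))
        {
            rows.Add(new[]
            {
                m.Groups["url"].Value,
                m.Groups["title"].Value,
                m.Groups["note"].Value
            });
        }
        return rows;
    }

    public static void Main()
    {
        // Hypothetical two-item list; the second item uses single quotes around href.
        string html =
            "<ul>\n" +
            "<li><a href=\"http://example.com/a\"><strong>first</strong></a><span class=\"writing\">x</span></li>\n" +
            "<li><a href='http://example.com/b'><strong>second</strong></a><span class=\"writing\">y</span></li>\n" +
            "</ul>";

        foreach (string[] row in Extract(html))
        {
            Console.WriteLine(string.Join(" | ", row));
        }
    }
}
```

Named groups keep the calling code readable when there are several captures. A pattern like this only holds up while every item contains the href, strong and span parts in that order; if an item is missing one of them, a match can run on into the next item.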

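Astar | Beans: 40805 (Master Level 7) | 2010-09-10 09:11

You're building a web application, right? You could handle this with the RegularExpressionValidator control.

js灰灰 | Beans: 7 (Beginner Level 1) | 2010-09-09 22:23

I'm not building a web application; it's a desktop application. I want a program that uses regular expressions to extract the URL and the title, so that I can look them up later!

nothing better | Beans: 235 (Rookie Level 2) | 2010-09-10 07:16

Code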

using System;
using System.Text;
using System.Text.RegularExpressions;

class Program
{
    public static void Main(string[] args)
    {
        string html0 = "<li><a href=\"http://news.naver.com/main/read.nhnmode=LSD&mid=shm&sid1=102&oid=003&aid=0003428615\"><strong>bb</strong></a><span class=\"writing\">cc</span></li>";
        string regex0 = @"<[^>]+>"; // matches any single HTML tag
        SplitHtmlString(html0, regex0);

        Console.ReadLine();
    }

    static void SplitHtmlString(string html, string regex)
    {
        Regex tagRegex = new Regex(regex, RegexOptions.IgnoreCase);

        // Split: the text between the tags.
        // For the sample input, piece 3 is "bb" and piece 6 is "cc".
        string[] splitResults = tagRegex.Split(html);

        // Match: the tags themselves. Match 1 is the <a href="..."> tag.
        MatchCollection matchesFound = tagRegex.Matches(html);

        StringBuilder resultString = new StringBuilder();
        string url = GetPureURL3(matchesFound[1].Value);
        resultString.AppendLine(url);
        resultString.AppendLine(splitResults[3]);
        resultString.AppendLine(splitResults[6]);
        Console.Write(resultString.ToString());
    }

    // Input looks like:
    // <a href="http://news.naver.com/main/read.nhnmode=LSD&mid=shm&sid1=102&oid=003&aid=0003428615">
    static string GetPureURL3(string sHtml)
    {
        // The URL is the text between the first pair of double quotes;
        // if there are no quotes, return the input unchanged.
        string[] result = sHtml.Split('"');
        return result.Length > 1 ? result[1] : sHtml;
    }
}

Beans awarded: 8

邀月 | Beans: 25475 (Master Level 7) | 2010-09-10 09:56

Thank you so much for your help! That solved it perfectly! Thanks again!

nothing better | Beans: 235 (Rookie Level 2) | 2010-09-10 11:23

@nothing better: Haha. Glad the problem is solved!

邀月 | Beans: 25475 (Master Level 7) | 2010-09-10 11:28

@邀月: Haha! Yes! Thank you!

nothing better | Beans: 235 (Rookie Level 2) | 2010-09-10 11:35
