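Two articles with nearly identical titles suddenly showed up on the home page, and by different authors at that. After looking at the results 幽梦新影 posted, I remembered that I had once written similar code in C. (I am just joining in the fun here.)
"My View on the Article 'Advanced String Extraction and Statistics' and a Regular Expression Implementation" (《<<字符串高级截取和统计>>一文的看法与正则表达式的实现》)
"Advanced String Extraction and Statistics" (《字符串高级截取和统计》) (Addendum: to my surprise, I have since found yet another one.)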
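Still, I wonder why everyone reaches for a regular expression whenever a feature like this comes up. At a third-party Microsoft training course I once attended, I was told that regular expressions are quite inefficient and that most performance bottlenecks stem from them; that said, it was a course on high performance.
What is beyond doubt is that regular expressions are an excellent practice: many intricate search-and-match rules can be solved right away with one.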
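To give a feel for how little code a regex lookup needs, here is a minimal sketch in C. It assumes the POSIX `<regex.h>` interface, which is not part of standard C, and the helper name `regex_find_all` is made up for this example; it is not taken from either of the articles mentioned above.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: stores the start offset of each match of `pattern`
 * in `text` (at most max_out of them) and returns how many were stored.
 * Returns 0 if the pattern does not compile. */
size_t regex_find_all(const char *text, const char *pattern,
                      int *out, size_t max_out)
{
    regex_t re;
    regmatch_t m;
    size_t count = 0;
    size_t pos = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;

    /* After the first match we are no longer at the beginning of the
     * text, so tell regexec that '^' must not match there. */
    while (count < max_out &&
           regexec(&re, text + pos, 1, &m, pos > 0 ? REG_NOTBOL : 0) == 0) {
        out[count++] = (int)(pos + (size_t)m.rm_so);
        if (m.rm_eo > m.rm_so) {
            pos += (size_t)m.rm_eo;         /* continue after the match */
        } else if (text[pos + (size_t)m.rm_eo] != '\0') {
            pos += (size_t)m.rm_eo + 1;     /* empty match: skip one char */
        } else {
            break;                          /* empty match at end of text */
        }
    }
    regfree(&re);
    return count;
}
```

Called with the sample sentence used later in this post and the pattern `wh`, it would report the offsets 16 and 30.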
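The C program below is something I wrote for fun a while ago; it just came to mind, so I am posting it to help round out the solution. It is easy to see, though, that with the traditional approach of building strings character by character, writing this kind of feature takes a considerable amount of work, whereas a regular expression keeps the code concise and elegant. As for efficiency, I personally think it can be ignored entirely.
My implementation seems to include a few extra features, but no matter: it only adds, never removes.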
/*
 * textsearch.h
 *
 * Created on:
 * Author: Volnet
 * Website: http://volnet.cnblogs.com
 *
 */

#ifndef TEXTSEARCH_H_
#define TEXTSEARCH_H_

#include <stdarg.h>
#include <stddef.h> /* size_t */

#ifndef INDEX_T_DEFINDED
typedef int index_t;
#define INDEX_T_DEFINDED
#endif

#ifndef COUNT_T_DEFINDED
typedef unsigned int count_t;
#define COUNT_T_DEFINDED
#endif

/*
 * @return
 * offset: the number of chars to advance from the start of the match
 *         once the search_foreach function has run.
 * e.g.
 *   for replace("abcdefgh", "de", "123"); // abc123fgh
 *   the callback sets offset to 3,
 *   which leaves the pointer at the next char, 'f'.
 * */
typedef void (*search_foreach)(const char *text,
        const char *found, index_t lengthOfFound,
        index_t *offset,
        va_list *paras);
typedef void (*search_global)(va_list *paras, void *ret);
#define UNFINDED ((index_t)-1)
#define MAX_INDICES 1000

/* declaration of functions */

/* function    : find the first char (c) in text.
 * @paras
 * text        : a pointer to the first char of the text to search.
 * c           : the char to find.
 * out_indices : (return) an array that receives the indices of tofind in the text.
 *               (value) out_indices may be NULL; that is not an error.
 * tofind      : the word to find in the text.
 * @return     : the index of the char in the text, or UNFINDED if it is not found.
 * */
index_t indexof(const char *text, char c);
count_t search_c(const char *text, const char c, index_t *out_indices);
count_t search_s(const char *text, const char *tofind, index_t *out_indices);
count_t search_s_foreach(
        /* the full text to search. */
        const char *text,
        /* the word to find in the text. */
        const char *tofind,
        /* (return) an array that receives the indices of tofind in the text.
         * (value) out_indices may be NULL; that is not an error.
         * */
        index_t *out_indices,
        /* a function, gfunc, applied to the text as a whole.
         * it is executed first, before everything else.
         * */
        search_global gfunc,
        /* the number of gfunc's parameters. */
        const size_t gfunc_paras_count,
        /* a function, func, applied to each occurrence of
         * the word that search_s finds in the text.
         * */
        search_foreach func,
        /* variadic parameters.
         * if gfunc != NULL, the first parameter is the {char *ret; }
         * ret is the parameter for return.
         * */
        ...
        );
/*
 * function : replace the word 's1' with 's2' in text.
 * @return  :
 * ret: an array that has enough space to hold the result.
 *      text != ret
 * */
void replace(const char *text, const char *s1, const char *s2, char ret[]);
/* function : remove the word 's' from text.
 * @return  :
 * ret: an array that has enough space to hold the result.
 *      text != ret
 * */
void to_remove(char *text, const char *s, char ret[]);

#endif /* TEXTSEARCH_H_ */
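The header only declares the functions; the implementation file is not shown in this post. As a rough sketch of what the three simple search functions might look like, the block below gives hypothetical bodies that match the declared signatures. The choice to count non-overlapping matches in `search_s`, and to return 0 for an empty search word, are my assumptions rather than something the header states.

```c
#include <stddef.h>
#include <string.h>

typedef int index_t;
typedef unsigned int count_t;
#define UNFINDED ((index_t)-1)

/* Hypothetical bodies: the post itself only shows the declarations. */

/* Index of the first occurrence of c in text, or UNFINDED. */
index_t indexof(const char *text, char c)
{
    const char *p = strchr(text, c);
    return p != NULL ? (index_t)(p - text) : UNFINDED;
}

/* Counts every occurrence of c; the indices are stored only when
 * out_indices is not NULL. */
count_t search_c(const char *text, const char c, index_t *out_indices)
{
    count_t count = 0;
    for (index_t i = 0; text[i] != '\0'; ++i) {
        if (text[i] == c) {
            if (out_indices != NULL)
                out_indices[count] = i;
            ++count;
        }
    }
    return count;
}

/* Counts the non-overlapping occurrences of tofind (an assumption);
 * the indices are stored only when out_indices is not NULL. */
count_t search_s(const char *text, const char *tofind, index_t *out_indices)
{
    count_t count = 0;
    size_t len = strlen(tofind);
    const char *p = text;

    if (len == 0)
        return 0;                       /* nothing to search for */
    while ((p = strstr(p, tofind)) != NULL) {
        if (out_indices != NULL)
            out_indices[count] = (index_t)(p - text);
        ++count;
        p += len;                       /* resume after this match */
    }
    return count;
}
```

The caller is responsible for making `out_indices` large enough; the header's `MAX_INDICES` looks like the intended upper bound, though that is a guess.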
Here is the calling code:
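What a single regular expression can handle takes us this much code to implement; the hand-written version clearly pales in comparison.
Here is the output: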
the first index of w is 10.
the 'w' in the text is in : 10 16 30 .
the "wh" in the text is in : 16 30 .
the word is:
this is a word, who contains "wh"!
the word has be changed as ("wh"->"jonson") :
this is a word, jonsono contains "jonson"!
the word remove (remove "onson") as :
this is a word, jo contains "j"!
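The replace and remove results shown in the output can be reproduced with a sketch like the following. The bodies are hypothetical (the original implementation is not shown here); they only follow the declared signatures and the rule from the header that `ret` must be large enough and must not be the same buffer as `text`.

```c
#include <string.h>

/* Hypothetical body: copies text into ret, replacing each
 * non-overlapping occurrence of s1 with s2. ret must be large enough
 * for the result and must not alias text. */
void replace(const char *text, const char *s1, const char *s2, char ret[])
{
    size_t len1 = strlen(s1);
    size_t len2 = strlen(s2);
    char *out = ret;

    if (len1 == 0) {                    /* nothing to search for */
        strcpy(ret, text);
        return;
    }
    while (*text != '\0') {
        if (strncmp(text, s1, len1) == 0) {
            memcpy(out, s2, len2);      /* emit the replacement */
            out += len2;
            text += len1;               /* skip the matched word */
        } else {
            *out++ = *text++;           /* copy one ordinary char */
        }
    }
    *out = '\0';
}

/* Hypothetical body: removing a word is replacing it with "". */
void to_remove(char *text, const char *s, char ret[])
{
    replace(text, s, "", ret);
}
```

Treating removal as replacement by the empty string keeps the two functions consistent with each other, at the cost of one extra function call.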