
The strstr function

2015-10-25 14:53
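Introduction:

Given two strings str1 and str2, determine whether str2 is a substring of str1; if it is, return the address in str1 where str2 begins.

Approach: first compute len2, the length of str2. Starting at the first character of str1, check whether the next len2 characters equal str2. If they do not, start again from the second character of str1 and check the next len2 characters, and so on. As soon as they match, return the address in str1 where the matching run begins.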

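The Linux function

char *strstr(const char *haystack, const char *needle);

Header: <string.h>

Return value: a char * pointing to the start of the match, or NULL if needle is not found

Purpose:

Determines whether the string str1 contains the given string str2. It is useful for fuzzy matching, for example a database query such as select * from tablename where query like '%select%'; or checking whether a line contains a given word.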

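For a concrete implementation of strstr, see the Baidu Baike entry for the strstr function.

The KMP string-matching algorithm can also be used to implement strstr. KMP is covered elsewhere on this blog and is not explained here. Given how KMP works, if str2 contains no substring that repeats its own beginning (that is, no prefix of str2 occurs again later in str2), its efficiency should be no better than direct comparison. The two are compared in the last part of this post.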

Example:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

int main()
{
    const char *str1 = "abcd12345";
    const char *substr2 = "123";
    const char *substr3 = "125";

    // substr2 occurs in str1, so p points at "12345"
    const char *p = strstr(str1, substr2);
    if (p)  // if (p != NULL)
    {
        fprintf(stdout, "%s\n", p);
        fflush(stdout);
    }
    else
        fprintf(stderr, "not found.\n");

    // substr3 does not occur in str1, so strstr returns NULL
    p = strstr(str1, substr3);
    if (p)
        fprintf(stdout, "%s\n", p);
    else
        fprintf(stderr, "not found.\n");
    return 0;
}


Help output from man strstr

iZ232ngsvp8Z:~/tmp # man strstr
STRSTR(3)                                            Linux Programmer's Manual                                           STRSTR(3)

NAME
strstr, strcasestr - locate a substring

SYNOPSIS
#include <string.h>

char *strstr(const char *haystack, const char *needle);

#define _GNU_SOURCE

#include <string.h>

char *strcasestr(const char *haystack, const char *needle);

DESCRIPTION
The  strstr()  function  finds  the  first occurrence of the substring needle in the string haystack.  The terminating '\0'
characters are not compared.

The strcasestr() function is like strstr(), but ignores the case of both arguments.

RETURN VALUE
These functions return a pointer to the beginning of the substring, or NULL if the substring is not found.

CONFORMING TO
The strstr() function conforms to C89 and C99.  The strcasestr() function is a non-standard extension.

BUGS
Early versions of Linux libc (like 4.5.26) would not allow an empty needle argument for  strstr().   Later  versions  (like
4.6.27) work correctly, and return haystack when needle is empty.

SEE ALSO
index(3),  memchr(3),  rindex(3),  strcasecmp(3),  strchr(3),  strpbrk(3), strsep(3), strspn(3), strtok(3), wcsstr(3),
feature_test_macros(7)

COLOPHON
This page is part of release 3.15 of the Linux man-pages project.  A description of  the  project,  and  information  about
reporting bugs, can be found at http://www.kernel.org/doc/man-pages/. 
GNU                                                         2005-04-05                                                   STRSTR(3)


KMP versus strstr

// KMP algorithm

#include <stdio.h>
#include <string.h>
#include<stdlib.h>
#include <sys/time.h>
#include <time.h>

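// Build the int array next[] for the pattern P[].
// next[i] is the length of the longest proper prefix of P[0..i-1]
// that is also a suffix of it; next[] must hold strlen(P) ints.
void getNext(const char *P, int next[])
{
    int N = strlen(P);
    int i, k;

    if (N == 0)
        return;
    next[0] = 0;
    if (N > 1)
        next[1] = 0;
    for (i = 2; i < N; i++)
    {
        k = next[i - 1];
        // Fall back through shorter borders until one can be extended.
        while (k > 0 && P[i - 1] != P[k])
            k = next[k];
        // Even at k == 0, P[i-1] must still be compared with P[0].
        next[i] = (P[i - 1] == P[k]) ? k + 1 : 0;
    }
}

// Using KMP, search T for P starting at index pos of T (pos >= 0).
// Returns the index in T where the first match of P begins,
// or -1 if there is no match (or the table cannot be allocated).
int KmpCmp(const char *T, const char *P, int pos)
{
    int len = strlen(P);
    int i = pos, j = 0;
    int *next;

    if (len == 0)
        return pos;         // an empty pattern matches immediately
    next = (int *)malloc(len * sizeof(int));
    if (next == NULL)
        return -1;
    getNext(P, next);

    while (T[i] != '\0' && P[j] != '\0')
    {
        if (T[i] == P[j])
        {
            i++;
            j++;
        }
        else
        {
            if (j == 0)
                i++;        // nothing matched yet: advance in T
            j = next[j];    // otherwise fall back within P
        }
    }
    free(next);
    // P[j] == '\0' means the whole pattern was matched, ending at i.
    return (P[j] == '\0') ? i - len : -1;
}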

// Current wall-clock time in microseconds.
long long GetTime()
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000000 + tv.tv_usec;
}

int main(int argc, char* argv[])
{
    enum { len1 = 1024000, len2 = 1000 };
    // static keeps the 1 MB buffer off the stack; +1 leaves room for '\0'
    static char T[len1 + 1];
    static char P[len2 + 1];
    int i = 0;

    srand(101);
    for (i = 0; i < len1; i++)
        T[i] = rand() % 10 + '0';
    // Copy the pattern from near the end of T so that a match exists.
    for (i = 0; i < len2; i++)
        P[i] = T[i + len1 - len2 - 23];
    P[len2] = 0;
    T[len1] = 0;

    // KMP: take the time before printing so I/O is not measured
    long long startTime = GetTime();
    int k = KmpCmp(T, P, 0);
    long long kmpTime = GetTime() - startTime;
    fprintf(stdout, "result: %d\n", k);
    fprintf(stdout, "time: %lld\n", kmpTime);

    // strstr
    startTime = GetTime();
    char *p = strstr(T, P);
    fprintf(stdout, "time: %lld\n", GetTime() - startTime);
    return p ? 0 : 1;   // p is used so the strstr call is not discarded
}


iZ232ngsvp8Z:~/tmp # ./kmp1

result: 1022977

time: 8445

time: 455

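This shows that the library's built-in strstr is quite fast: in this run it took 455 microseconds against 8445 for the KMP code above, roughly 18 times faster. KMP implementations written by different people will of course differ in performance, but the gap should not be an order of magnitude.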

temp
https://github.com/postgres/postgres/commit/fd4ced5230162b50a5c9d33b4bf9cfb1231aa62e http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=fd4ced5230162b50a5c9d33b4bf9cfb1231aa62e