STL Split String
2010-05-19 17:45
Description
Below is a function I created and have found extremely useful for splitting strings based on a particular delimiter. The implementation only requires STL, which makes it easy to port to any OS that supports STL. The function is fairly lightweight, although I haven't done extensive performance testing.
The delimiter can be any number of characters, represented as a string. The parts of the string between delimiters are then put into a string vector. The class StringUtils contains one static function, SplitString. The int returned is the number of delimiters found within the input string.
I used this utility mainly for parsing strings that were being passed across platform boundaries. Whether you are using raw sockets or middleware such as TIBCO®, it is uncomplicated to pass string data. I found it more efficient to pass delimited string data versus repeated calls or messages. Another place I used this was in passing BSTRs back and forth between a Visual Basic client and an ATL COM DLL. It proved to be easier than passing a SAFEARRAY as an [in] or [out] parameter. This was also beneficial when I did not want the added overhead of MFC and hence could not use CString.
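To make the "one delimited string instead of repeated calls" idea concrete, here is a minimal sketch of the sending side. The helper name joinFields is hypothetical and not part of the article's code, and it assumes that no field contains the delimiter itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sender-side helper (not part of StringUtils): packs several
// fields into one delimited string so they can travel in a single message.
// Assumption: no field contains the delimiter.
std::string joinFields(const std::vector<std::string>& fields,
                       const std::string& delimiter)
{
    std::string packed;
    for (std::size_t i = 0; i < fields.size(); ++i)
    {
        if (i > 0)
            packed += delimiter;   // delimiter goes between fields only
        packed += fields[i];
    }
    return packed;
}
```

The receiver would then split the packed string on the same delimiter to recover the fields.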
Implementation
The SplitString function uses the STL string functions find and substr to iterate through the input string. The hardest part was figuring out how to get the substring of the input string based on the offsets of the delimiter, not forgetting to take into account the length of the delimiter. Another hurdle was making sure not to call substr with an offset greater than the length of the input string.
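The two hurdles described above can be seen in a compact sketch of the same find/substr idea. This is not the article's SplitString: the name splitSketch is hypothetical, it returns the pieces instead of a delimiter count, and, unlike SplitString, it returns the whole input as a single piece when no delimiter is found.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper illustrating the find/substr loop.
// Returns the pieces of `input` that lie between occurrences of `delimiter`.
std::vector<std::string> splitSketch(const std::string& input,
                                     const std::string& delimiter,
                                     bool includeEmpties = true)
{
    std::vector<std::string> parts;
    if (input.empty() || delimiter.empty())
        return parts;                       // nothing to split on

    std::size_t start = 0;                  // first character of the current piece
    std::size_t hit = input.find(delimiter, start);
    while (hit != std::string::npos)
    {
        // The piece runs from `start` up to, but not including, the delimiter.
        std::string piece = input.substr(start, hit - start);
        if (includeEmpties || !piece.empty())
            parts.push_back(piece);

        // Skip the whole delimiter, not just one character.
        start = hit + delimiter.size();
        hit = input.find(delimiter, start);
    }

    // Tail after the last delimiter. `start` never exceeds input.size(),
    // so substr cannot throw std::out_of_range here.
    std::string tail = input.substr(start);
    if (includeEmpties || !tail.empty())
        parts.push_back(tail);

    return parts;
}
```

Tracking a single `start` offset with std::size_t and comparing against std::string::npos avoids both the signed/unsigned conversions and the separate vector of positions, at the cost of not reporting the delimiter count directly.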
Header
#ifndef __STRINGUTILS_H_
#define __STRINGUTILS_H_

#include <string>
#include <vector>
using namespace std;

class StringUtils
{
public:
    static int SplitString(const string& input,
        const string& delimiter, vector<string>& results,
        bool includeEmpties = true);
};

#endif
Source
int StringUtils::SplitString(const string& input,
    const string& delimiter, vector<string>& results,
    bool includeEmpties)
{
    int iPos = 0;
    int newPos = -1;
    int sizeS2 = (int)delimiter.size();
    int isize = (int)input.size();

    if( ( isize == 0 ) || ( sizeS2 == 0 ) )
    {
        return 0;
    }

    // Offsets of every delimiter found in the input
    vector<int> positions;

    // find returns string::npos when nothing is found; the code relies on
    // that value becoming negative when converted to int
    newPos = (int)input.find (delimiter, 0);
    if( newPos < 0 )
    {
        return 0;
    }

    int numFound = 0;
    while( newPos >= iPos )
    {
        numFound++;
        positions.push_back(newPos);
        iPos = newPos;
        // Resume the search just past the delimiter that was found
        newPos = (int)input.find (delimiter, iPos+sizeS2);
    }

    if( numFound == 0 )
    {
        return 0;
    }

    for( int i=0; i <= (int)positions.size(); ++i )
    {
        string s("");
        if( i == 0 )
        {
            // Everything before the first delimiter
            s = input.substr( i, positions[i] );
        }
        else
        {
            // positions[i-1] is only valid for i > 0
            int offset = positions[i-1] + sizeS2;
            if( offset < isize )
            {
                if( i == (int)positions.size() )
                {
                    // Everything after the last delimiter
                    s = input.substr(offset);
                }
                else
                {
                    s = input.substr( offset,
                        positions[i] - positions[i-1] - sizeS2 );
                }
            }
        }
        if( includeEmpties || ( s.size() > 0 ) )
        {
            results.push_back(s);
        }
    }
    return numFound;
}
Output using demo project
main.exe "|mary|had|a||little|lamb||" "|"
int SplitString(
const string& input,
const string& delimiter,
vector<string>& results,
bool includeEmpties = true
)
-------------------------------------------------------
input = |mary|had|a||little|lamb||
delimiter = |
return value = 8 // Number of delimiters found
results.size() = 9
results[0] = ''
results[1] = 'mary'
results[2] = 'had'
results[3] = 'a'
results[4] = ''
results[5] = 'little'
results[6] = 'lamb'
results[7] = ''
results[8] = ''
int SplitString(
const string& input,
const string& delimiter,
vector<string>& results,
bool includeEmpties = false
)
-------------------------------------------------------
input = |mary|had|a||little|lamb||
delimiter = |
return value = 8 // Number of delimiters found
results.size() = 5
results[0] = 'mary'
results[1] = 'had'
results[2] = 'a'
results[3] = 'little'
results[4] = 'lamb'
MFC version
For those of you who absolutely cannot use STL and are committed to MFC, I made a few minor changes to the above implementation. It uses CString instead of std::string and a CStringArray instead of a std::vector:
//------------------------
// SplitString in MFC
//------------------------
int StringUtils::SplitString(const CString& input,
    const CString& delimiter, CStringArray& results)
{
    int iPos = 0;
    int newPos = -1;
    int sizeS2 = delimiter.GetLength();
    int isize = input.GetLength();

    // Nothing to do for an empty input or an empty delimiter
    if( isize == 0 || sizeS2 == 0 ) { return 0; }

    CArray<INT, int> positions;

    newPos = input.Find (delimiter, 0);
    if( newPos < 0 ) { return 0; }

    int numFound = 0;
    // >= so that a delimiter at offset 0 is counted, as in the STL version
    while( newPos >= iPos )
    {
        numFound++;
        positions.Add(newPos);
        iPos = newPos;
        // Resume right after this delimiter so adjacent delimiters are found
        newPos = input.Find (delimiter, iPos+sizeS2);
    }

    for( int i=0; i <= positions.GetSize(); i++ )
    {
        CString s;
        if( i == 0 )
            s = input.Mid( i, positions[i] );
        else
        {
            int offset = positions[i-1] + sizeS2;
            if( offset < isize )
            {
                if( i == positions.GetSize() )
                    s = input.Mid(offset);
                else
                    s = input.Mid( offset,
                        positions[i] - positions[i-1] - sizeS2 );
            }
        }
        // This version always drops empty strings
        if( s.GetLength() > 0 )
            results.Add(s);
    }
    return numFound;
}
String neutral version
I added this version in case you need to use it with any type of string. The only requirement is that the string class must have a constructor that takes a char*. The code only depends on the STL vector. I also added the option to not include empty strings in the results, which will occur if delimiters are adjacent:
#include <cstring>
#include <vector>
using namespace std;

//-----------------------------------------------------------
// StrT:    Type of string to be constructed
//          Must have char* ctor.
// str:     String to be parsed.
// delim:   Pointer to delimiter.
// results: Vector of StrT for strings between delimiter.
// empties: Include empty strings in the results.
// Returns the number of strings in results.
//-----------------------------------------------------------
template< typename StrT >
int split(const char* str, const char* delim,
    vector<StrT>& results, bool empties = true)
{
    const char* pstr = str;
    const char* r = NULL;
    int dlen = (int)strlen(delim);
    // An empty delimiter would match forever without advancing
    if( dlen == 0 ) { return 0; }
    r = strstr(pstr, delim);
    while( r != NULL )
    {
        // Copy the piece before the delimiter into a temporary C string
        char* cp = new char[(r-pstr)+1];
        memcpy(cp, pstr, (r-pstr));
        cp[(r-pstr)] = '\0';
        if( strlen(cp) > 0 || empties )
        {
            StrT s(cp);
            results.push_back(s);
        }
        delete[] cp;
        // Continue just past the delimiter
        pstr = r + dlen;
        r = strstr(pstr, delim);
    }
    // Whatever follows the last delimiter
    if( strlen(pstr) > 0 || empties )
    {
        results.push_back(StrT(pstr));
    }
    return (int)results.size();
}
String neutral usage
// using CString
//------------------------------------------
size_t i = 0;
vector<CString> results;
split("a-b-c--d-e-", "-", results);
for( i=0; i < results.size(); ++i )
{
    cout << results[i].GetBuffer(0) << endl;
    results[i].ReleaseBuffer();
}

// using std::string
//------------------------------------------
vector<string> stdResults;
split("a-b-c--d-e-", "-", stdResults);
for( i=0; i < stdResults.size(); ++i )
{
    cout << stdResults[i].c_str() << endl;
}

// using std::string without empties
//------------------------------------------
stdResults.clear();
split("a-b-c--d-e-", "-", stdResults, false);
for( i=0; i < stdResults.size(); ++i )
{
    cout << stdResults[i].c_str() << endl;
}
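The usage above works for CString and std::string because both can be constructed from a char*. As an illustration of that single requirement, the sketch below uses a hypothetical minimal type, TinyStr, with a condensed variant of the splitter named splitInto. Both names are made up for this example, and the variant builds each piece with a temporary std::string rather than a manually allocated buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical minimal string type. The only thing the splitter needs
// from it is a constructor that accepts a const char*.
class TinyStr
{
public:
    TinyStr(const char* s) : text_(s) {}
    const std::string& text() const { return text_; }
private:
    std::string text_;
};

// Condensed, hypothetical variant of the string neutral splitter.
// Returns the number of strings in `results`.
template <typename StrT>
int splitInto(const char* str, const char* delim,
              std::vector<StrT>& results, bool empties = true)
{
    const std::size_t dlen = std::strlen(delim);
    if (dlen == 0)
        return 0;                            // an empty delimiter never advances

    const char* p = str;
    const char* r = std::strstr(p, delim);
    while (r != NULL)
    {
        std::string piece(p, r);             // characters in [p, r)
        if (!piece.empty() || empties)
            results.push_back(StrT(piece.c_str()));
        p = r + dlen;                        // continue past the delimiter
        r = std::strstr(p, delim);
    }
    if (*p != '\0' || empties)
        results.push_back(StrT(p));          // remainder after the last delimiter
    return static_cast<int>(results.size());
}
```

Calling splitInto("a-b-c--d-e-", "-", out) with a std::vector<TinyStr> fills it the same way the CString and std::string calls above fill theirs.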
Conclusion
Hope you find this as useful as I did. Feel free to let me know of any bugs or enhancements. Enjoy ;)
License
This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt, please contact the author.
About the Author
Paul J. Weiss