William Cameron

Split string in C++

I have been doing coding katas recently on codingame.com and codewars.com, and something I have noticed is that C++ doesn't have very good built-in support for string splitting.


This is not that surprising, but I find it unfortunate that, with all the changes being made to the C++ standard, there still isn't a native way to do this easily. Perhaps something as simple as this (hypothetical) call:

std::vector<iterator> spaces = std::find_all(begin(s), end(s), ' ');
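
The find_all call above does not exist in the standard library. As a rough sketch of what such a helper might look like if you wrote it yourself, the function below (the name, signature and return type are my own assumptions) repeatedly calls std::find and collects an iterator to each match:

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper, not part of the standard library: returns an
// iterator to every occurrence of `delimiter` in `s`.
std::vector<std::string::const_iterator> find_all(const std::string& s, char delimiter)
{
	std::vector<std::string::const_iterator> hits;
	auto it = std::find(s.cbegin(), s.cend(), delimiter);
	while (it != s.cend())
	{
		hits.push_back(it);
		// Resume the search one past the match we just recorded.
		it = std::find(std::next(it), s.cend(), delimiter);
	}
	return hits;
}
```

The returned iterators are only valid while the original string is alive and unmodified, so this only gives you the split points. You would still need to build the substrings between them.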
 

In case you want to know some ways of doing this in C++, I have made a list of the ways I know. Please do comment with other ways you have thought of. Each of the examples below prints the following to the console:

1) this
2) is
3) a
4) string
5) of
6) text
7) to
8) split


std::stringstream

#include <string>
#include <sstream>
#include <iostream>
int main()
{
	using namespace std;
	
	string a{"this is a string of text to split"};
	stringstream stream(a);
	string split;
	int i{1};
	while (getline(stream, split, ' '))
	{
		cout << i << ") " << split << endl;
		++i;
	}
}

std::regex

#include <string>
#include <regex>
#include <iostream>
 
int main()
{
	using namespace std;
	
	string a{"this is a string of text to split"};
	regex whitespace(R"(\s+)");
	smatch match;
	string buffer{a};
	int i{1};
	while (regex_search(buffer, match, whitespace))
	{
		cout << i << ") " << match.prefix() << endl;
		buffer = match.suffix();
		++i;
	}
	cout << i << ") " << buffer << endl;
}

std::string::find()

#include <string>
#include <iostream>
 
int main()
{
	using namespace std;
 
	string a{"this is a string of text to split"};
	size_t begin{0};
	auto pos{a.find(' ', begin)};
	int i{1};
	while (pos != string::npos)
	{
		cout << i << ") " << a.substr(begin, pos - begin) << endl;
		begin = pos + 1;
		++i;
		pos = a.find(' ', begin);
	}
	cout << i << ") " << a.substr(begin) << endl;
}
 

I think those are the main ones. std::stringstream seems to be the least verbose, but which is fastest?

I did a basic benchmark: I ran each of these methods 1 million times and timed them. The results are as follows:

MinGW x64 Windows (Debug binaries):

std::stringstream took 1671ms

std::regex took 15994.1ms

std::string.find() took 1189.95ms


MinGW x64 Windows (Release binaries):

std::stringstream took 907.998ms

std::regex took 3199.08ms

std::string.find() took 366.002ms


MSVC 14 (2017) x64 (Release binaries):

std::stringstream took 1117.02ms

std::regex took 6823.93ms

std::string.find() took 536.398ms


MSVC 14 (2017) x64 (Debug binaries):

std::stringstream took 74282.3ms

std::regex took 660995ms

std::string.find() took 76841.8ms


The MSVC debug build was so horrifyingly slow that I ran it 1.5 times instead of 3 like the others.
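
The timing code itself is not shown in this post. A minimal sketch of how a measurement like this could be taken, assuming std::chrono::steady_clock and a helper I am calling time_ms (a hypothetical name, not necessarily how the numbers above were produced):

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: calls `work` `iterations` times and returns the
// total elapsed wall-clock time in milliseconds.
template <typename Work>
double time_ms(Work work, std::size_t iterations)
{
	const auto start = std::chrono::steady_clock::now();
	for (std::size_t n = 0; n < iterations; ++n)
	{
		work();
	}
	const auto stop = std::chrono::steady_clock::now();
	// duration<double, std::milli> converts the tick count to fractional ms.
	return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

You would pass each splitting method in as a lambda, for example time_ms([&] { /* split the string here */ }, 1000000). Keep in mind that if the work prints to the console, the I/O is likely to dominate the measurement, and in release builds the optimizer may remove work whose result is never used.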

As you can see, std::string::find() is the fastest in every configuration except the MSVC debug build, where stringstream is marginally ahead. But because stringstream is much less verbose, I think I will default to using that for readability.
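
One possible way to get the speed of find() without repeating the loop at every call site is to wrap it in a small function. This is only a sketch: split_on is a hypothetical name of my own, and it handles a single-character delimiter only, keeping empty tokens between consecutive delimiters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on `delimiter` using the same
// std::string::find loop as the example earlier in the post.
std::vector<std::string> split_on(const std::string& text, char delimiter)
{
	std::vector<std::string> parts;
	std::size_t begin{0};
	auto pos = text.find(delimiter, begin);
	while (pos != std::string::npos)
	{
		parts.push_back(text.substr(begin, pos - begin));
		begin = pos + 1;
		pos = text.find(delimiter, begin);
	}
	// Whatever follows the last delimiter (possibly an empty string).
	parts.push_back(text.substr(begin));
	return parts;
}
```

With this in place, a call such as split_on(a, ' ') is about as short as the stringstream version, at the cost of allocating a vector for the results.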



