Is there some reason that splitWhitespace() from strutils lacks a maxsplit parameter? It is implemented there as

iterator splitWhitespace*(s: string): string =
  ## Splits at whitespace.
  oldSplit(s, Whitespace, -1)
using oldSplit(), which does have this parameter. So why not

iterator splitWhitespace*(s: string, maxsplit: int = -1): string =
  ## Splits at whitespace.
  oldSplit(s, Whitespace, maxsplit)
?

2017-10-12 20:17:22

Use the split proc:

proc split(s: string; seps: set[char] = Whitespace; maxsplit: int = -1): seq[string] {..}

For example:

from strutils import split

for token in "My string when splitted".split(maxsplit = 1):
  echo token

2017-10-13 07:28:51

split on Whitespace and splitWhitespace are not equivalent:

from strutils import split, splitWhitespace
let s = "  a couple of \t words "
echo s.split.len            # prints 9
echo s.splitWhitespace.len  # prints 4
With leading whitespace, split(maxsplit = 1)[0] is an empty string, whereas I would expect splitWhitespace(maxsplit = 1) to yield the first non-whitespace token in the string. Sure, one can strip the leading whitespace before calling split(), etc.
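A minimal sketch of the workaround mentioned above, assuming the strutils strip proc with its trailing flag. It only illustrates the difference in the first token and is not a replacement for splitWhitespace, since split still breaks on every single whitespace character:

```nim
import strutils

let s = "  a couple of \t words "

# Plain split treats each leading whitespace character as a separator,
# so the first element is empty.
let raw = s.split(maxsplit = 1)
echo raw[0].len        # 0

# Strip only the leading whitespace, then split once.
let stripped = s.strip(trailing = false).split(maxsplit = 1)
echo stripped[0]       # a
```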

But as far as I can see, all the functionality for splitWhitespace(maxsplit = <something>) is already there; it is just not exposed through the public splitWhitespace interface.

The question is "why?" Is it buggy or what?

2017-10-13 08:05:55

The optional maxsplit parameter was added to strutils.splitWhitespace.

For details see: https://github.com/nim-lang/Nim/issues/6503
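A short usage sketch, assuming a Nim version that already includes this change. The input string is the one used earlier in the thread:

```nim
import strutils

let s = "  a couple of \t words "

# With maxsplit = 1, leading whitespace is skipped, the first token is
# returned, and the rest of the string becomes the second element.
let parts = s.splitWhitespace(maxsplit = 1)
echo parts.len   # 2
echo parts[0]    # a
```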

2017-11-08 12:11:54

@olwi

I'd say it looks like a bug. I always use split though.

2017-11-08 23:13:15

@Udiknedormin

What exactly looks like a bug?

2017-11-08 23:21:09

That split can't behave like splitWhitespace. I guess it should, and just be more general.

Well, there is a similar library in Fortran. If I recall correctly, it has a function somewhat similar to split (it's also an iterator). It separates the concepts of separator characters and ignored characters. So it would be something like this:

echo "  a couple of \t words ".split(seps = Whitespace)
# @[, , a, couple, of, , , words, ]
echo "  a couple of \t words ".split(ignore = Whitespace)
# @[a, couple, of, words]

2017-11-09 08:05:52
Well, I think it is possible to merge splitWhitespace into split like this:
  1. s.split() splits on whitespace (the way splitWhitespace does)
  2. all other forms of split work like they do now. To get the current default behaviour of split one should use s.split(Whitespace)

In other words:

echo "  a couple of \t words ".split()
# @[a, couple, of, words]
echo "  a couple of \t words ".split(Whitespace)
# @[, , a, couple, of, , , words, ]
This is easy to implement, but as far as I understand that would constitute a breaking change...

2017-11-09 20:43:43
My version would not. Old code can't use parameters that didn't exist at the time, so you would just have to add another split argument that works the way splitWhitespace does today, and then make splitWhitespace an alias for some split call (with a deprecation annotation) for backwards compatibility.
2017-11-10 07:43:16
Maybe add a noEmpties: bool = false argument to split that returns the version without the empty strings (the splitWhitespace behaviour).
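A rough sketch of how that suggestion could look as a wrapper. The proc name splitSketch and the noEmpties flag are hypothetical and not part of strutils; the wrapper just filters the output of the existing split iterator:

```nim
import strutils

# Hypothetical helper, not part of strutils: wraps the existing split
# iterator and optionally drops empty tokens.
proc splitSketch(s: string; seps: set[char] = Whitespace;
                 noEmpties: bool = false): seq[string] =
  for token in s.split(seps):
    if noEmpties and token.len == 0:
      continue
    result.add token

let s = "  a couple of \t words "
echo splitSketch(s).len                # 9
echo splitSketch(s, noEmpties = true)  # @["a", "couple", "of", "words"]
```

Because the flag defaults to false, existing calls would keep their current results, which is the backwards-compatibility point made above.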
2017-12-05 22:40:20