SPLIT

The SPLIT function takes two strings, base_string and separator, and returns a list of strings. Each item in the list is a fragment of base_string, cut at every point where separator appeared.

The separator itself is removed during the split, so it does not appear in any of the items in the returned list.

Declaration

SPLIT(base_string, separator) -> list_of_strings

 

Parameters

base_string (required, type: string)
The string to split.

separator (required, type: string)
The string that marks where base_string is cut. Every instance of the separator is removed from base_string, and the fragments that remain are returned as the items of the list.

Return Values

list_of_strings (type: list)
The list of strings remaining after taking the base string and cleaving it at instances where the separator string appeared.

Examples

The following example takes the string "This is the string to split", splits it at each instance of a space (" "), and outputs the resulting fragments as strings in a list. Note that none of the strings in this list contain any spaces.

SPLIT("This is the string to split", " ") -> ["This", "is", "the", "string", "to", "split"]

One notable feature of the SPLIT function is how it behaves if the base string and the separator are the same string. The following example takes the string "This is the string to split" and splits it at each instance of "This is the string to split" – the entire base string. Removing the entire base string leaves only blanks at what was the beginning and end of the string, and this results in a list with two items, both of them blank strings:

SPLIT("This is the string to split", "This is the string to split") -> ["", ""]

The next example demonstrates what happens if the separator is longer than the base string. It takes the string "This is the string to split" and attempts to split it at each instance of "This is the string to split and more". No such instance can exist when the separator is longer than the base string, so the output is a list with only a single item, the complete base string:

SPLIT("This is the string to split", "This is the string to split and more") -> ["This is the string to split"]

In the above example, the output is a list containing only a single item: the base string in its entirety. This is the output SPLIT returns any time there are no instances of the separator within the base string. The same thing happens in the following example, which takes the string "This is the string to split" and attempts to split it at each instance of "coconuts". Nowhere in "This is the string to split" does the string "coconuts" appear, so the output is again a list containing only the full base string:

SPLIT("This is the string to split", "coconuts") -> ["This is the string to split"]
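
For readers who know Python, the cases shown so far happen to match the behavior of Python's built-in str.split when it is given a non-empty separator. The sketch below is only an analogy and assumes nothing about how SPLIT itself is implemented.

```python
# Analogy only: this uses Python's str.split, not the SPLIT function itself.
base_string = "This is the string to split"

# Ordinary case: split at each space.
words = base_string.split(" ")

# Separator equal to the base string: two empty strings remain.
same = base_string.split(base_string)

# Separator that never occurs (too long, or simply absent):
# the whole base string comes back as the only item.
too_long = base_string.split(base_string + " and more")
absent = base_string.split("coconuts")

print(words)
print(same)
print(too_long)
print(absent)
```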

A separator is required in order for the SPLIT function to run, but the string that makes up the separator does not need to contain any characters. The following example takes the string "This is a string" and separates it at each instance of "", which occurs between each character in the base string:

SPLIT("This is a string", "") -> ["T", "h", "i", "s", " ", "i", "s", " ", "a", " ", "s", "t", "r", "i", "n", "g"]
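
Not every language's built-in split treats an empty separator this way. Python's str.split, for example, raises a ValueError when the separator is empty. The following Python sketch shows one way to reproduce the behavior documented above. The name split_sketch is hypothetical, and the code illustrates the documented behavior only; it is not SPLIT's actual implementation.

```python
from __future__ import annotations


def split_sketch(base_string: str, separator: str) -> list[str]:
    """Hypothetical stand-in that mimics the documented SPLIT behavior."""
    if separator == "":
        # Empty separator: one item per character, as in the example above.
        return list(base_string)

    fragments = []
    start = 0
    while True:
        # Find the next occurrence of the separator at or after `start`.
        index = base_string.find(separator, start)
        if index == -1:
            # No more separators: the rest of the string is the last fragment.
            fragments.append(base_string[start:])
            return fragments
        # Keep the text before the separator, then skip past the separator.
        fragments.append(base_string[start:index])
        start = index + len(separator)


print(split_sketch("This is a string", ""))
```

With a non-empty separator, the loop keeps the text between occurrences and discards the separator, which is why the separator never appears in the result.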

The input that goes into the SPLIT function doesn't need to be hardcoded. Assume the final example has access to the following variables:

long_string = "We want to split this long string in sort of a strange way."  
short_string = "t t"

The following example takes the string value of long_string, splits it at each instance of short_string, and outputs the resulting fragments as strings in a list.

SPLIT(long_string, short_string) -> ["We wan", "o spli", "his long string in sort of a strange way."]

 

Discussion

As a rule of thumb, the separator you pass to SPLIT should be shorter than base_string, though, as demonstrated above, this is not a requirement. If the separator is longer than base_string, the output is still valid: a list containing a single item, base_string, entirely unmodified. This may be unavoidable when looping through several variables as input. If the goal is only to put base_string into a list, however, the list literal [base_string] has the same effect and takes less computing time.

 

SPLIT can be thought of as the opposite of the JOIN function. That is, for any two strings, let's call them big_string and little_string, you can use JOIN to undo what SPLIT did, like so:

JOIN(SPLIT(big_string, little_string), little_string) -> big_string

In other words, SPLIT takes a string that may contain instances of a separator, removes each instance, and returns the remaining fragments as a list, while JOIN takes a list of strings and combines them into a single string, inserting a separator string at each point where two items are joined.

To walk through this step by step, consider the following example, where big_string is "This is a string" and little_string is " ":

JOIN(SPLIT("This is a string", " "), " ")

The inner function, SPLIT, takes the string "This is a string", cleaves out the spaces, and outputs the remaining fragments as strings in a list:

SPLIT("This is a string", " ") -> ["This", "is", "a", "string"]

The outer function then takes the output of this inner function and combines all the strings in the list into a single string, placing a space between each pair of items (" " being the string we just cleaved from the original, longer string). Because the same string (" ") was used as the second input value for both functions, the output matches the original input string exactly. In other words, JOIN reinserted the spaces that SPLIT just removed:

JOIN(["This", "is", "a", "string"], " ") -> "This is a string"
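
The round trip described above can be checked in Python, whose built-in str.split and str.join behave analogously for a non-empty separator. This is a sketch by analogy, not the SPLIT or JOIN implementation; the variable names are borrowed from the discussion above.

```python
# Analogy only: Python's str.split and str.join stand in for SPLIT and JOIN.
big_string = "This is a string"
little_string = " "

fragments = big_string.split(little_string)  # the SPLIT step
rejoined = little_string.join(fragments)     # the JOIN step

print(fragments)
print(rejoined == big_string)
```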