String split() Java

The Java designers recognized the usefulness of a split() method, which already existed in other languages such as JavaScript, and added it to the String class in JDK 1.4. In earlier versions, splitting a string into independent words or tokens was done with StringTokenizer. Using split() in place of StringTokenizer is straightforward.
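To make the comparison concrete, here is a minimal sketch that tokenizes the same comma-separated string both ways. The class name, the method names and the sample string are illustrative assumptions, not taken from the original program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerVersusSplit {

    // Pre-JDK 1.4 style: walk through the tokens one at a time.
    static List<String> withTokenizer(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, ",");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // JDK 1.4 and later: a single call returns all tokens as an array.
    static String[] withSplit(String text) {
        return text.split(",");
    }

    public static void main(String[] args) {
        String text = "red,green,blue"; // hypothetical sample input
        System.out.println(withTokenizer(text));              // [red, green, blue]
        System.out.println(Arrays.toString(withSplit(text))); // [red, green, blue]
    }
}
```

Note that StringTokenizer treats its delimiter argument as a set of literal characters, whereas split() treats its argument as a regular expression.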

The following example uses String split() in various combinations.

[Image: String split()]
[Image: Output screen on String split() Java]

Before explaining the above code, let us first look at the signatures of the split() method as defined in the String class since JDK 1.4.

The split() method is overloaded; it comes in two forms.

public String[] split(String regex): Splits this string around matches of the given regular expression. This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.

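public String[] split(String regex, int limit): Splits this string around matches of the given regular expression. The limit argument controls how many times the pattern is applied and therefore affects the length of the resulting array.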
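The effect of the limit argument is easiest to see on a string with trailing delimiters. The following is a small sketch; the class name, the helper splitWithLimit() and the sample string are assumptions made for illustration.

```java
import java.util.Arrays;

class SplitLimitDemo {

    // Thin wrapper so the three limit cases can be compared side by side.
    static String[] splitWithLimit(String text, int limit) {
        return text.split(",", limit);
    }

    public static void main(String[] args) {
        String text = "a,b,c,,"; // hypothetical input with two trailing commas

        // limit == 0: split as often as possible, trailing empty strings dropped
        System.out.println(Arrays.toString(splitWithLimit(text, 0)));  // [a, b, c]

        // limit > 0: at most limit elements; the last one holds the unsplit rest
        System.out.println(Arrays.toString(splitWithLimit(text, 2)));  // [a, b,c,,]

        // limit < 0: split as often as possible, trailing empty strings kept
        System.out.println(Arrays.toString(splitWithLimit(text, -1))); // [a, b, c, , ]
    }
}
```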

The split() method takes a regular expression as parameter.


The split() method returns a string array. To print the elements of a string array, the easiest way is to use the toString() method of the java.util.Arrays class.
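As a brief illustration of why Arrays.toString() is needed, the sketch below prints a split result both directly and through Arrays.toString(). The class name, the helper formatTokens() and the sample string are illustrative assumptions.

```java
import java.util.Arrays;

class PrintSplitResult {

    // Splits text on regex and formats the resulting array for display.
    static String formatTokens(String text, String regex) {
        return Arrays.toString(text.split(regex));
    }

    public static void main(String[] args) {
        String[] tokens = "one two three".split(" ");

        // Printing the array reference directly shows a type-and-hash string,
        // not the elements.
        System.out.println(tokens);

        // Arrays.toString() lists the elements inside square brackets.
        System.out.println(Arrays.toString(tokens));            // [one, two, three]
        System.out.println(formatTokens("one two three", " ")); // [one, two, three]
    }
}
```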

// to get correct output when the string contains extra spaces

String str11 = "abc def    ghi"; // string with 4 spaces between def and ghi
String str11Array[] = str11.split("\\s+"); // gives [abc, def, ghi]

To treat a run of one or more whitespace characters as a single delimiter, the regular expression \\s+ is used (remember, split() takes a regular expression as its parameter).

// should be coded as
String str5 = "abc.def.ghi"; // . is a special character
String str5Array[] = str5.split("\\."); // gives [abc, def, ghi]

String str7 = "abc|def|ghi"; // | is a special character
String str7Array[] = str7.split("\\|"); // gives [abc, def, ghi]

When the delimiter is a special character, a little care is needed, because characters such as . and | have a special meaning in a regular expression. To match a literal dot, it must be preceded by a single backslash: \. But the backslash is itself a special character in a Java string literal, so it has to be written as two backslashes: \\. In fact, \\ is the escape sequence for a single backslash.
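As an alternative to escaping by hand, the standard method java.util.regex.Pattern.quote() turns any string into a pattern that matches it literally. The sketch below assumes a hypothetical helper named splitLiteral(); it also shows what happens when the dot is left unescaped.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralDelimiterDemo {

    // Splits on the delimiter taken literally, whatever characters it contains.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // Unescaped, "." matches every character, so only empty strings remain
        // and all of them are discarded as trailing empty strings.
        System.out.println("abc.def.ghi".split(".").length); // 0

        System.out.println(Arrays.toString(splitLiteral("abc.def.ghi", "."))); // [abc, def, ghi]
        System.out.println(Arrays.toString(splitLiteral("abc|def|ghi", "|"))); // [abc, def, ghi]
    }
}
```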

The braces { and } are also special characters. Inside a character class [...] they are treated literally, so they can be written as follows.

// string with letters enclosed within { and }
String str8 = "{abc}{def}{ghi}";
String str8Array[] = str8.split("[{}]"); // gives [, abc, , def, , ghi] // { and } are special characters
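The empty strings in that result appear wherever two delimiters are adjacent or the string starts with a delimiter. If they are not wanted, one possible approach (a sketch that assumes Java 8 or later for the stream API, with a hypothetical helper named splitNonEmpty()) is to filter them out after splitting.

```java
import java.util.Arrays;

class DropEmptyTokens {

    // Splits on regex and discards every empty token from the result.
    static String[] splitNonEmpty(String text, String regex) {
        return Arrays.stream(text.split(regex))
                     .filter(token -> !token.isEmpty())
                     .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String text = "{abc}{def}{ghi}";
        System.out.println(Arrays.toString(text.split("[{}]")));         // [, abc, , def, , ghi]
        System.out.println(Arrays.toString(splitNonEmpty(text, "[{}]"))); // [abc, def, ghi]
    }
}
```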

To split a long string into fixed-size groups of characters, use a regular expression as follows.

// a long string split into groups of 4 characters
String str9 = "abcdefghijklmnopqrstu";
String str9Array[] = str9.split("(?<=\\G.{4})"); // gives [abcd, efgh, ijkl, mnop, qrst, u]

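In the above code, the string is split into groups of 4 characters. The lookbehind (?<=\\G.{4}) matches each position that lies exactly 4 characters after the end of the previous match (\\G), or after the start of the string for the first match.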
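The same grouping can also be done without a regular expression. The following is a minimal sketch of that alternative using substring(); it is a swapped-in technique, not the split() approach shown above, and the class name and the helper chunk() are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class FixedSizeChunks {

    // Cuts text into consecutive pieces of the given size; the last piece
    // may be shorter when the length is not a multiple of size.
    static List<String> chunk(String text, int size) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < text.length(); i += size) {
            parts.add(text.substring(i, Math.min(text.length(), i + size)));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(chunk("abcdefghijklmnopqrstu", 4)); // [abcd, efgh, ijkl, mnop, qrst, u]
    }
}
```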

// a split including a special character
String str10 = "abc*def*ghi*";
String str10Array[] = str10.split("(?<=[*])"); // gives [abc*, def*, ghi*]

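In the above code, the delimiter * is included in the output tokens. Normally the delimiter is not part of the output; here it is kept because the lookbehind (?<=[*]) matches the empty position just after each * rather than the * itself.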
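For comparison, the sketch below splits the same string in three ways: on the delimiter itself, with a lookbehind, and with a lookahead (?=[*]), which leaves each * at the start of the following token. The class name and the helper tokens() are illustrative assumptions.

```java
import java.util.Arrays;

class KeepDelimiterDemo {

    // Splits text on regex and formats the result for display.
    static String tokens(String text, String regex) {
        return Arrays.toString(text.split(regex));
    }

    public static void main(String[] args) {
        String text = "abc*def*ghi*";

        // The delimiter is consumed and does not appear in the tokens.
        System.out.println(tokens(text, "[*]"));      // [abc, def, ghi]

        // Lookbehind: split just after each *, so * ends each token.
        System.out.println(tokens(text, "(?<=[*])")); // [abc*, def*, ghi*]

        // Lookahead: split just before each *, so * starts the next token.
        System.out.println(tokens(text, "(?=[*])"));  // [abc, *def, *ghi, *]
    }
}
```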