

5.2 Searching and Replacing

— Function File: deblank (s)

Remove trailing blanks and nulls from s. If s is a matrix, deblank trims each row to the length of the longest string. If s is a cell array, operate recursively on each element of the cell array.
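
As a minimal sketch of the three cases described above (a plain string, a character matrix, and a cell array), the following could be tried at the Octave prompt. The variable names are illustrative only, and the results shown assume a reasonably recent Octave release.

```octave
% Plain string: trailing blanks are removed.
s = deblank ("abc   ");
len_s = length (s);               % 3

% Character matrix: trailing columns that are blank in every row are
% dropped, so the rows are trimmed to the longest remaining string.
m = deblank (["ab  "; "abc "]);
size_m = size (m);                % [2 3]

% Cell array: each element is trimmed separately.
c = deblank ({"ab  ", "c "});     % {"ab", "c"}
```

Note that in the matrix case the shorter row keeps its padding, since every row of a character matrix must have the same length.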

— Function File: findstr (s, t, overlap)

Return the vector of all positions in the longer of the two strings s and t where an occurrence of the shorter of the two starts. If the optional argument overlap is nonzero, the returned vector can include overlapping positions (this is the default). For example,

          findstr ("ababab", "a")
          => [ 1, 3, 5 ]
          findstr ("abababa", "aba", 0)
          => [ 1, 5 ]
     

— Function File: index (s, t)

Return the position of the first occurrence of the string t in the string s, or 0 if no occurrence is found. For example,

          index ("Teststring", "t")
          => 4
     

Caution: This function does not work for arrays of strings.

— Function File: rindex (s, t)

Return the position of the last occurrence of the string t in the string s, or 0 if no occurrence is found. For example,

          rindex ("Teststring", "t")
          => 6
     

Caution: This function does not work for arrays of strings.

— Function File: split (s, t, n)

Divides the string s into pieces separated by t, returning the result in a string array (padded with blanks to form a valid matrix). If the optional input n is supplied, split s into at most n different pieces.

For example,

          split ("Test string", "t")
          => "Tes "
                  " s  "
                  "ring"
     
          split ("Test string", "t", 2)
          => "Tes    "
                  " string"
     

— Function File: strcmp (s1, s2)

Return 1 if the character strings s1 and s2 are the same, and 0 otherwise. Caution: For compatibility with Matlab, Octave's strcmp function returns 1 when the strings are equal. This is just the opposite of the corresponding C library function, which returns 0 for equal strings.
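
The difference from the C convention is easiest to see in a condition. The sketch below is only an illustration; the variable names and the value of answer are made up for the example.

```octave
same  = strcmp ("hello", "hello");   % 1: identical strings
other = strcmp ("hello", "Hello");   % 0: the comparison is case sensitive

% Because equality yields a true value, the result can be used directly
% as a condition (in C the test would be strcmp (...) == 0).
answer = "yes";                      % illustrative value
if (strcmp (answer, "yes"))
  msg = "confirmed";
else
  msg = "declined";
endif
```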

— Function File: strrep (s, x, y)

Replaces all occurrences of the substring x of the string s with the string y. For example,

          strrep ("This is a test string", "is", "&%$")
          => "Th&%$ &%$ a test string"
     

— Function File: substr (s, beg, len)

Return the substring of s which starts at character number beg and is len characters long.

If beg is negative, extraction starts that far from the end of the string. If len is omitted, the substring extends to the end of s.

For example,

          substr ("This is a test string", 6, 9)
          => "is a test"
     
This function is patterned after AWK. For a positive beg and an explicit len, you can get the same result with the indexing expression s (beg : (beg + len - 1)).
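
The equivalence with plain indexing can be checked directly. This is a small sketch that reuses the string from the example above; only positive beg and an explicit len are exercised.

```octave
s   = "This is a test string";
beg = 6;
len = 9;

a = substr (s, beg, len);            % "is a test"
b = s(beg : (beg + len - 1));        % same characters via indexing
```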

— Loadable Function: [s, e, te, m, t, nm] = regexp (str, pat)
— Loadable Function: [...] = regexp (str, pat, opts, ...)

Regular expression string matching. Matches pat in str and returns the position and matching substrings or empty values if there are none.

The matched pattern pat can include any of the standard regex operators, including:

.
Match any character
* + ? {}
Repetition operators:
     *
     Match zero or more times
     +
     Match one or more times
     ?
     Match zero or one time
     {}
     Match range operator, which is of the form {n} to match exactly n times, {m,} to match m or more times, or {m,n} to match between m and n times

[...] [^...]
List operators, where for example [ab]c matches ac and bc
()
Grouping operator
|
Alternation operator. Match one of a choice of regular expressions. The alternatives must be delimited by the grouping operator () above.
^ $
Anchoring operators. ^ matches the start of the string str and $ matches the end.
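
A few of these operators in use, as a rough sketch. The test strings are invented for illustration, and the calls rely on the default output order of regexp (start indices first, then end indices, with the matched text as the fourth output).

```octave
str = "foo123bar";

% [...] with + : one or more digits.
[s, e] = regexp (str, '[0-9]+');        % s = 4, e = 6

% {n} : exactly two consecutive "o" characters.
[s2, e2] = regexp (str, 'o{2}');        % s2 = 2, e2 = 3

% () with | : alternation inside a group; the fourth output holds
% the text of each match.
[s3, e3, te3, m3] = regexp ("cat dog", '(cat|dog)');   % s3 = [1 5]

% ^ : anchored to the start of the string, so "bar" is not found.
s4 = regexp (str, '^bar');              % empty
```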

In addition, the following escaped characters have special meaning. It is recommended to quote pat in single quotes rather than double quotes, so that the escape sequences are not interpreted by Octave before being passed to regexp.

\b
Match a word boundary
\B
Match within a word
\w
Matches any word character
\W
Matches any non-word character
\<
Matches the beginning of a word
\>
Matches the end of a word
\s
Matches any whitespace character
\S
Matches any non-whitespace character
\d
Matches any digit
\D
Matches any non-digit
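
The quoting advice can be illustrated with a short sketch. The strings are made up for the example; the point is that a single-quoted pattern reaches regexp with its backslashes intact, while a double-quoted one needs each backslash doubled.

```octave
% In single quotes the backslash reaches regexp unchanged.
pat = '\d+';
n   = length (pat);                     % 3 characters: \ d +

% In double quotes the backslash itself has to be escaped to obtain
% the same pattern.
same = strcmp (pat, "\\d+");            % 1

[s, e] = regexp ("order 66 shipped", pat);   % s = 7, e = 8
ws     = regexp ("a b", '\s');               % 2: position of the blank
wb     = regexp ("one two", '\bt');          % 5: "t" at a word boundary
```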

By default, the outputs of regexp are returned in the order given below:

s
The start indices of each of the matching substrings
e
The end indices of each matching substring
te
The extents of each of the matched tokens surrounded by (...) in pat.
m
A cell array of the text of each match.
t
A cell array of the text of each token matched.
nm
A structure containing the text of each matched named token, with the name being used as the fieldname. A named token is denoted as (?<name>...).
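
To make the six outputs concrete, here is a sketch with a single match and two named tokens. The input string and the token names key and val are chosen purely for illustration, and the exact container types (for example, fields of nm being plain strings when there is only one match) are those of a recent Octave release and may differ in older versions.

```octave
str = "color=blue";
pat = '(?<key>\w+)=(?<val>\w+)';

[s, e, te, m, t, nm] = regexp (str, pat);

% s      = 1                    start of the match
% e      = 10                   end of the match
% te{1}  = [1 5; 7 10]          one row of extents per token
% m{1}   = "color=blue"         text of the match
% t{1}   = {"color", "blue"}    text of each token in the match
% nm.key = "color", nm.val = "blue"
```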

Particular output arguments, or the order of the output arguments, can be selected by additional opts arguments. These are strings, and the correspondence between the output arguments and the optional arguments is:

'start' s
'end' e
'tokenExtents' te
'match' m
'tokens' t
'names' nm

A further optional argument is 'once', which limits the returned matches to the first match.
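
The option strings might be used as in the following sketch. The input string is invented for the example, and the form of the 'once' results (a plain value rather than a cell array) is an assumption based on recent Octave releases.

```octave
str = "a1 b22 c333";

% Request only the matched text and the start indices, in that order.
[m, s] = regexp (str, '\d+', 'match', 'start');
% m = {"1", "22", "333"}, s = [2 5 9]

% Request only the tokens; each match contributes one cell of tokens.
tok    = regexp (str, '([a-z])(\d+)', 'tokens');
second = tok{2};                        % {"b", "22"}

% 'once' stops after the first match.
first_start = regexp (str, '\d+', 'start', 'once');   % 2
first_match = regexp (str, '\d+', 'match', 'once');   % "1"
```

Selecting outputs by name avoids having to supply placeholder variables for the outputs that are not needed.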

— Loadable Function: [s, e, te, m, t, nm] = regexpi (str, pat)
— Loadable Function: [...] = regexpi (str, pat, opts, ...)

Case-insensitive regular expression string matching. Matches pat in str and returns the position and matching substrings, or empty values if there are none. See regexp for more details.
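
A brief comparison of the two functions, as a sketch with an invented input string:

```octave
str = "Hello HELLO hello";

% regexp is case sensitive: only the lowercase occurrence matches.
m_cs = regexp (str, 'hello', 'match');    % {"hello"}

% regexpi ignores case: all three occurrences match.
m_ci = regexpi (str, 'hello', 'match');   % {"Hello", "HELLO", "hello"}
```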