The re module is a wrapper around boost::regex, intended as a full
replacement for Lua's built-in regular expressions. It has two main advantages
over Lua's:
Import this module with re = require 'aegisub.re'.
See the Boost.Regex documentation for details of the regular expression syntax. In general, any web resource that covers Perl regular expressions or PCRE also applies to this module's regular expressions.
Several of the functions below return Match Tables, which are tables containing the following fields:
str (string): The text of the match.
first (number): Index of the first byte of str in the original string which had a regular
expression applied to it. Note that this index is one-based and is in bytes,
rather than characters, to match Lua's string indexing.
last (number): Index of the last byte of str in the original string which had a regular expression
applied to it. Note that this index is one-based, inclusive, and is in bytes,
rather than characters, to match Lua's string indexing.
>>> re.match("abc", "b")
{
{
["str"] = "b",
["first"] = 2,
["last"] = 2
}
}
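As a small illustration of how first and last relate to Lua's own string indexing, the sketch below rebuilds the matched text with string.sub. It runs in plain Lua and does not call the re module: the table m is copied from the example output above, and the helper name slice_match is hypothetical.

```lua
-- Hypothetical helper: recover the matched text from the original string
-- using the one-based, inclusive byte indices stored in a Match Table.
local function slice_match(original, m)
    return original:sub(m.first, m.last)
end

-- Match Table copied from the documented output for "b" matched in "abc".
local m = { str = "b", first = 2, last = 2 }
print(slice_match("abc", m)) --> b
```

Because both indices are inclusive, the length of the match in bytes is last - first + 1.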
Flags such as re.ICASE and re.NOSUB may be passed to all of the static functions (including
re.compile). Flags must come after all supplied non-flag arguments, but
optional arguments can be skipped.
>>> re.match("a", "A")
nil
>>> re.match("a", "A", re.ICASE, re.NOSUB)
{
{
["str"] = "A",
["first"] = 1,
["last"] = 1
}
}
Synopsis: expr = re.compile(pattern, [FLAGS])
Compile a regular expression. Reusing a compiled regular expression is faster than recompiling it each time it is used, and is usually more readable as well.
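@pattern (string): The regular expression to compile.
expr (table): The compiled regular expression. Its methods are called without the pattern argument, as in expr:split(str).
>>> expr = re.compile("a")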
>>> expr:split("banana")
{
"b",
"n",
"n"
}
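One possible way to get the benefit of reuse without passing compiled expressions around a script is to cache them by pattern. The sketch below is only an illustration under stated assumptions, not part of the module: make_cache and its compile_fn parameter are hypothetical names, flags are ignored, and a stub stands in for re.compile so that the snippet also runs outside Aegisub.

```lua
-- Hypothetical helper: wrap a compile function so that each distinct
-- pattern is compiled once and the compiled object is reused afterwards.
local function make_cache(compile_fn)
    local cache = {}
    return function(pattern)
        local expr = cache[pattern]
        if expr == nil then
            expr = compile_fn(pattern)
            cache[pattern] = expr
        end
        return expr
    end
end

-- Stub standing in for re.compile; it only counts how often it is called.
local compile_calls = 0
local function stub_compile(pattern)
    compile_calls = compile_calls + 1
    return { pattern = pattern }
end

local get_expr = make_cache(stub_compile)
local first = get_expr("a+")
local second = get_expr("a+")
print(first == second) --> true
print(compile_calls)   --> 1
```

Inside an Aegisub script the stub would presumably be replaced by the real function, for example make_cache(re.compile).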
Synopsis: chunks = re.split(str, pattern, skip_empty=false, max_splits=0)
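Split the string at each of the occurrences of pattern.
@str (string): The string to split.
@pattern (string): The regular expression to split at.
@skip_empty (boolean): If true, empty chunks are left out of the result.
@max_splits (number): Maximum number of times to split the string; 0 means no limit (#chunks will be at most max_splits + 1).
chunks (table): The substrings of str between the matches of pattern.
>>> re.split("a,,b,c", ",")
{
"a",
"",
"b",
"c"
}
>>> re.split("a,,b,c", ",", true)
{
"a",
"b",
"c"
}
>>> re.split("a,,b,c", ",", false, 1)
{
"a",
",b,c"
}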
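To make the effect of skip_empty and max_splits concrete, here is a plain-Lua sketch that splits on a literal separator instead of a regular expression. It is not the module's implementation: split_literal is a hypothetical name, and counting only emitted chunks against max_splits is an assumption made for the illustration. It reproduces the three documented results above.

```lua
-- Illustration only: split on a literal (non-regex) separator to show how
-- skip_empty and max_splits shape the result.
local function split_literal(str, sep, skip_empty, max_splits)
    assert(sep ~= "", "separator must not be empty")
    local chunks = {}
    local splits = 0
    local pos = 1
    while true do
        local s, e
        -- Stop looking for separators once the split limit is reached
        -- (nil or 0 means no limit).
        if not max_splits or max_splits == 0 or splits < max_splits then
            s, e = str:find(sep, pos, true) -- plain find, no pattern matching
        end
        if not s then break end
        local chunk = str:sub(pos, s - 1)
        if chunk ~= "" or not skip_empty then
            chunks[#chunks + 1] = chunk
            splits = splits + 1
        end
        pos = e + 1
    end
    -- Whatever follows the last separator forms the final chunk.
    local rest = str:sub(pos)
    if rest ~= "" or not skip_empty then
        chunks[#chunks + 1] = rest
    end
    return chunks
end

print(table.concat(split_literal("a,,b,c", ","), "|"))           --> a||b|c
print(table.concat(split_literal("a,,b,c", ",", true), "|"))     --> a|b|c
print(table.concat(split_literal("a,,b,c", ",", false, 1), "|")) --> a|,b,c
```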
Synopsis: iter = re.gsplit(str, pattern, skip_empty=false, max_splits=0)
Iterator version of re.split.
@str (string): The string to split.
@pattern (string): The regular expression to split at.
@skip_empty (boolean): If true, empty chunks are skipped.
@max_splits (number): Maximum number of times to split the string; 0 means no limit (at most max_splits + 1 chunks will be produced).
iter (iterator over strings): Iterator over the substrings of str between the matches of pattern.
>>> for str in re.gsplit("a,,b,c", ",") do
>>>     print(str)
>>> end
a

b
c
>>> for str in re.gsplit("a,,b,c", ",", true) do
>>>     print(str)
>>> end
a
b
c
>>> for str in re.gsplit("a,,b,c", ",", false, 1) do
>>>     print(str)
>>> end
a
,b,c
Synopsis: matches = re.find(str, pattern)
Find all non-overlapping substrings of str which match pattern.
@str (string): The string to search in.
@pattern (string): The regular expression to search for.
matches (table or nil): A table of Match Tables, one for each match, or nil if
there were none.
>>> re.find("☃☃", ".")
{
{
["str"] = "☃",
["first"] = 1,
["last"] = 3
},
{
["str"] = "☃",
["first"] = 4,
["last"] = 6
}
}
function contains_an_a(str)
    if re.find(str, "a") then
        print "Has an a"
    else
        print "Doesn't have an a"
    end
end
>>> contains_an_a("abc")
Has an a
>>> contains_an_a("def")
Doesn't have an a
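Since first and last are byte indices, a script that needs character positions has to convert them itself. The following plain-Lua sketch shows one way to do that, assuming the string is valid UTF-8; byte_to_char_index is a hypothetical helper and not part of the re module.

```lua
-- Hypothetical helper: convert a one-based byte index into a one-based
-- character index, assuming the string is valid UTF-8.
local function byte_to_char_index(str, byte_idx)
    local chars = 0
    for i = 1, math.min(byte_idx, #str) do
        local b = str:byte(i)
        -- Bytes in the range 0x80-0xBF continue a previous character;
        -- every other byte starts a new one.
        if b < 0x80 or b >= 0xC0 then
            chars = chars + 1
        end
    end
    return chars
end

local snowmen = "\226\152\131\226\152\131" -- "☃☃" written as raw UTF-8 bytes
print(byte_to_char_index(snowmen, 1)) --> 1
print(byte_to_char_index(snowmen, 4)) --> 2
```

Applied to the ☃☃ example above, the two matches starting at bytes 1 and 4 are characters 1 and 2.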
Synopsis: iter = re.gfind(str, pattern)
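Iterate over all non-overlapping substrings of str which match pattern.
@str (string): The string to search in.
@pattern (string): The regular expression to search for.
iter (iterator over string, number, number): Iterator which yields the matched text, its start index and its end index for each match.
>>> for str, start_idx, end_idx in re.gfind("☃☃", ".") do
>>>     print(string.format("%d-%d: %s", start_idx, end_idx, str))
>>> end
1-3: ☃
4-6: ☃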
Synopsis: matches = re.match(str, pattern)
Match a pattern against a string. This differs from find in that find
returns all matches and does not capture subgroups, while this returns only a
single match along with the captured subgroups.
@str (string): The string to match against.
@pattern (string): The regular expression to match.
matches (table or nil): nil if the pattern did not match the string. Otherwise, a table containing
a Match Table for the full match, followed by a Match
Table for each capturing subexpression in the pattern (if
any).
>>> re.match("{250 1173 380}Help!", "(\\d+) (\\d+) (\\d+)")
{
{
["str"] = "250 1173 380",
["first"] = 2,
["last"] = 13
},
{
["str"] = "250",
["first"] = 2,
["last"] = 4
},
{
["str"] = "1173",
["first"] = 6,
["last"] = 9
},
{
["str"] = "380",
["first"] = 11,
["last"] = 13
}
}
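A result of this shape is often easier to use once the captured strings are pulled out of it. The sketch below is plain Lua operating on a table copied from the example output above, so it runs without the re module; capture_strings is a hypothetical helper name.

```lua
-- Hypothetical helper: collect the text of each capturing subexpression
-- from a re.match-style result, skipping the full match at index 1.
local function capture_strings(matches)
    local out = {}
    if matches then -- a nil result (no match) yields an empty list
        for i = 2, #matches do
            out[#out + 1] = matches[i].str
        end
    end
    return out
end

-- Result table copied from the documented example output.
local matches = {
    { str = "250 1173 380", first = 2, last = 13 },
    { str = "250", first = 2, last = 4 },
    { str = "1173", first = 6, last = 9 },
    { str = "380", first = 11, last = 13 },
}
print(table.concat(capture_strings(matches), ", ")) --> 250, 1173, 380
```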
Synopsis: iter = re.gmatch(str, pattern)
Iterator version of re.match.
@str (string): The string to match against.
@pattern (string): The regular expression to match.
iter (iterator over table): Iterator which yields, for each match, a table in the format returned by re.match.
Synopsis: out_str, rep_count = re.sub(str, pattern, replace, max_count=0)
Replace each occurrence of pattern in str with replace.
@str (string): The string to perform replacements on.
@pattern (string): The regular expression to match.
@replace (string or function): Replacement for matches. This may be either a string which is inserted, or a function which is called for each match.
If replace is a string, it may contain references to the matches. & and \0
are replaced with the full match, and \<number> is replaced with the
appropriate captured subexpression.
If replace is a function, it is called for either the entire match (if
there are no capturing subexpressions), or for each captured subexpression.
It is passed the match string, start index of the match, and end index of
the match. If it returns a string, the match is replaced with the return
value. If it returns anything else, the match is left unchanged.
@max_count (number): Maximum number of replacements to make; 0 means no limit.
out_str (string): The string with the replacements applied.
rep_count (number): The number of replacements made.
Replace all instances of \k with \kf:
>>> re.sub("{\\k10}a{\\k15}b{\\k30}c", "\\\\k", "\\kf")
{\kf10}a{\kf15}b{\kf30}c
Replace all instances of \k and \K with \kf:
>>> re.sub("{\\K10}a{\\K15}b{\\k30}c", "\\\\k", "\\kf", re.ICASE)
{\kf10}a{\kf15}b{\kf30}c
Add one to each \k duration:
function add_one(str)
    return tostring(tonumber(str) + 1)
end
>>> re.sub("{\\k10}a{\\k15}b{\\k30}c", "\\\\k([[:digit:]]+)", add_one)
{\k11}a{\k16}b{\k31}c
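The function form of replace can be generalised with a closure. The sketch below builds a callback that scales a captured duration instead of adding one to it; make_scaler is a hypothetical name, and the callback relies only on the behaviour described above (a string return value replaces the text, any other return value leaves it alone). Because the re module exists only inside Aegisub, the call to re.sub is guarded so that the snippet still runs in plain Lua.

```lua
-- Hypothetical callback factory: scale a captured \k duration by a factor.
local function make_scaler(factor)
    return function(str)
        local n = tonumber(str)
        if not n then
            return nil -- not a string, so the text would be left as it was
        end
        -- Round to the nearest whole number before converting back.
        return tostring(math.floor(n * factor + 0.5))
    end
end

local double = make_scaler(2)
print(double("15")) --> 30

-- Only attempt the real substitution when running inside Aegisub.
local ok, re = pcall(require, "aegisub.re")
if ok then
    -- The extra parentheses drop rep_count so only out_str is printed.
    print((re.sub("{\\k10}a{\\k15}b", "\\\\k([[:digit:]]+)", double)))
end
```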