NAME

PPIx::Regexp::Token::Modifier - Represent modifiers.

SYNOPSIS

use PPIx::Regexp::Dumper;
PPIx::Regexp::Dumper->new( 'qr{foo}smx' )
    ->print();

The trailing smx will be represented by this class.

This class also represents the whole of things like (?ismx). But the modifiers in something like (?i:foo) are represented by a PPIx::Regexp::Token::GroupType::Modifier.
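As a rough illustration of the distinction above, the following sketch lists every modifier-bearing token in a parsed expression together with its class. It assumes PPIx::Regexp is installed, and the sample pattern is made up for illustration. Because PPIx::Regexp::Token::GroupType::Modifier is a subclass of this class, searching for the parent class should turn up both kinds of token.

```perl
use strict;
use warnings;
use PPIx::Regexp;

# Illustrative pattern: a trailing modifier (smx), an embedded (?i),
# and a group-type modifier (?-x:...).
my $re = PPIx::Regexp->new( 'qr{(?i)foo(?-x:bar)}smx' )
    or die 'Failed to parse regular expression';

# find() returns an array reference, or a false value if nothing matched.
my $found = $re->find( 'PPIx::Regexp::Token::Modifier' ) || [];

foreach my $token ( @{ $found } ) {
    # Print the class of each token next to the text it represents.
    printf "%-45s %s\n", ref $token, $token->content();
}
```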

INHERITANCE

PPIx::Regexp::Token::Modifier is a PPIx::Regexp::Token.

PPIx::Regexp::Token::Modifier is the parent of PPIx::Regexp::Token::GroupType::Modifier.

DESCRIPTION

This class represents modifier characters at the end of the regular expression. For example, in qr{foo}smx this class would represent the terminal smx.

The a, aa, d, l, and u modifiers

The a, aa, d, l, and u modifiers, introduced starting in Perl 5.13.6, are used to force either Unicode pattern semantics (u), locale semantics (l), default semantics (d, the traditional Perl semantics, which can also mean 'dual' since it means Unicode if the string's UTF-8 bit is on, and locale if the UTF-8 bit is off), or restricted default semantics (a or aa). These are mutually exclusive, and only one can be asserted at a time. Asserting any of these overrides the inherited value of any of the others. The asserts() method reports as asserted the last one it sees, or none of them if it has seen none.

For example, given PPIx::Regexp::Token::Modifier $elem representing the invalid regular expression fragment (?dul), $elem->asserts( 'l' ) would return true, but $elem->asserts( 'u' ) would return false. Note that $elem->negates( 'u' ) would also return false, since u is not explicitly negated.

If $elem represented regular expression fragment (?i), $elem->asserted( 'd' ) would return false, since even though d represents the default behavior it is not explicitly asserted.
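The (?dul) case can be tried out with a short script along the following lines. This is a sketch under assumptions: it presumes PPIx::Regexp is installed, that the embedded group is tokenized as a single modifier token whose content is exactly (?dul), and the surrounding pattern is invented for illustration.

```perl
use strict;
use warnings;
use PPIx::Regexp;

# Illustrative pattern only; (?dul) is deliberately the invalid
# fragment discussed above.
my $re = PPIx::Regexp->new( 'qr{(?dul)foo}' )
    or die 'Failed to parse regular expression';

# Pick out the token whose text is exactly the embedded modifier group.
my ( $elem ) = grep { $_->content() eq '(?dul)' }
    @{ $re->find( 'PPIx::Regexp::Token::Modifier' ) || [] };

if ( $elem ) {
    # Report what the token explicitly asserts and negates for each
    # of the semantics modifiers it contains.
    foreach my $mod ( qw{ d u l } ) {
        printf "%s: asserts=%s negates=%s\n", $mod,
            $elem->asserts( $mod ) ? 'yes' : 'no',
            $elem->negates( $mod ) ? 'yes' : 'no';
    }
}
```

If the description above holds, only l should be reported as asserted, and none of the three as negated.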

The caret (^) modifier

Calling ^ a modifier is a bit of a misnomer. The (?^...) construction was introduced in Perl 5.13.6, to prevent the inheritance of modifiers. The documentation calls the caret a shorthand equivalent for d-imsx, and that is the way this class handles it.

For example, given PPIx::Regexp::Token::Modifier $elem representing regular expression fragment (?^i), $elem->asserts( 'd' ) would return true, since in the absence of an explicit l or u this class considers the ^ to explicitly assert d.

The caret handling is complicated by the fact that the 'n' modifier was introduced in 5.21.8, at which point the caret became equivalent to d-imnsx. I did not feel I could unconditionally add the -n to the expansion of the caret, because that would produce confusing output from methods like explain(). Nor could I make it conditional on the minimum perl version, because that information is not available early enough in the parse. What I did was to expand the caret into d-imnsx if and only if 'n' was in effect at some point in the scope in which the modifier was parsed.

Continuing the above example, $elem->asserts( 'n' ) and $elem->modifier_asserted( 'n' ) would both return false, but $elem->negates( 'n' ) would return true if and only if the /n modifier has been asserted somewhere before and in-scope from this token. The modifier_asserted( 'n' ) method is inherited from PPIx::Regexp::Element.
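One way to observe the conditional expansion of the caret is to parse the same fragment with and without a trailing /n and compare what the token reports. The sketch below assumes PPIx::Regexp is installed and that the embedded group is tokenized with content (?^i); the patterns and the %asserts_d and %negates_n hashes are made up for illustration.

```perl
use strict;
use warnings;
use PPIx::Regexp;

# Results keyed by pattern, kept so they can be inspected afterwards.
my ( %asserts_d, %negates_n );

foreach my $pattern ( 'qr{(?^i)foo}', 'qr{(?^i)foo}n' ) {
    my $re = PPIx::Regexp->new( $pattern )
        or die "Failed to parse $pattern";

    # Locate the embedded (?^i) token among all modifier tokens.
    my ( $elem ) = grep { $_->content() eq '(?^i)' }
        @{ $re->find( 'PPIx::Regexp::Token::Modifier' ) || [] };
    $elem or next;

    $asserts_d{$pattern} = $elem->asserts( 'd' ) ? 1 : 0;
    $negates_n{$pattern} = $elem->negates( 'n' ) ? 1 : 0;

    printf "%-16s asserts d: %d  negates n: %d\n",
        $pattern, $asserts_d{$pattern}, $negates_n{$pattern};
}
```

Going by the description above, both patterns should report d as asserted, while the report for n should differ depending on whether /n is in effect.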

METHODS

This class provides the following public methods. Methods not documented here are private, and unsupported in the sense that the author reserves the right to change or remove them without notice.

asserts

$token->asserts( 'i' ) and print "token asserts i\n";
foreach ( $token->asserts() ) { print "token asserts $_\n" }

This method returns true if the token explicitly asserts the given modifier. The example would return true for the modifier in (?i:foo), but false for (?-i:foo).

Starting with version 0.036_01, if the argument is a single-character modifier followed by an asterisk (intended as a wild card character), the return is the number of times that modifier appears. In this case an exception will be thrown if you specify a multi-character modifier (e.g. 'ee*').

If called without an argument, or with an undef argument, all modifiers explicitly asserted by this token are returned.
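The wild card form is perhaps easiest to see with a doubled modifier such as ee on a substitution. The following is a minimal sketch, assuming PPIx::Regexp version 0.036_01 or later is installed; the substitution itself is invented for illustration.

```perl
use strict;
use warnings;
use PPIx::Regexp;

# Illustrative substitution with the e modifier given twice.
my $re = PPIx::Regexp->new( 's/foo/bar/ee' )
    or die 'Failed to parse regular expression';

# modifier() on the top-level object returns the trailing modifier token.
my $token = $re->modifier()
    or die 'No modifier token found';

# With the trailing asterisk, the return is a count rather than a
# simple true/false value.
my $count = $token->asserts( 'e*' ) || 0;
print "e appears $count time(s)\n";

# With no argument, all explicitly asserted modifiers are returned.
my @asserted = $token->asserts();
print "asserted: @asserted\n";
```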

match_semantics

my $sem = $token->match_semantics();
defined $sem or $sem = 'undefined';
print "This token has $sem match semantics\n";

This method returns the match semantics asserted by the token, as one of the strings 'a', 'aa', 'd', 'l', or 'u'. If no explicit match semantics are asserted, this method returns undef.

modifiers

my %mods = $token->modifiers();

Returns all modifiers asserted or negated by this token, and the values set (true for asserted, false for negated). If called in scalar context, returns a reference to a hash containing the values.
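As a small sketch of both calling conventions, the following walks the hash returned in list context and then fetches the hash reference returned in scalar context. It assumes PPIx::Regexp is installed; the pattern is the one from the SYNOPSIS.

```perl
use strict;
use warnings;
use PPIx::Regexp;

my $re = PPIx::Regexp->new( 'qr{foo}smx' )
    or die 'Failed to parse regular expression';

# The trailing smx is represented by a PPIx::Regexp::Token::Modifier.
my $token = $re->modifier()
    or die 'No modifier token found';

# List context: a hash of modifier name to true (asserted) or
# false (negated).
my %mods = $token->modifiers();
foreach my $name ( sort keys %mods ) {
    printf "%-2s %s\n", $name, $mods{$name} ? 'asserted' : 'negated';
}

# Scalar context: a reference to a hash containing the same values.
my $mods_ref = $token->modifiers();
print scalar( keys %{ $mods_ref } ), " modifier(s) recorded\n";
```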

negates

$token->negates( 'i' ) and print "token negates i\n";
foreach ( $token->negates() ) { print "token negates $_\n" }

This method returns true if the token explicitly negates the given modifier. The example would return true for the modifier in (?-i:foo), but false for (?i:foo).

If called without an argument, or with an undef argument, all modifiers explicitly negated by this token are returned.

SUPPORT

Support is by the author. Please file bug reports at https://rt.cpan.org/Public/Dist/Display.html?Name=PPIx-Regexp, https://github.com/trwyant/perl-PPIx-Regexp/issues, or in electronic mail to the author.

AUTHOR

Thomas R. Wyant, III wyant at cpan dot org

COPYRIGHT AND LICENSE

Copyright (C) 2009-2023 by Thomas R. Wyant, III

This program is free software; you can redistribute it and/or modify it under the same terms as Perl 5.10.0. For more details, see the full text of the licenses in the directory LICENSES.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose.