=pod

=for stopwords lookup RequireBlockGrep

=head1 NAME

Perl::Critic::DEVELOPER - How to make new Perl::Critic::Policy modules.

=head1 DESCRIPTION

For developers who want to create custom coding standards, the following tells how to create a Policy module for L. Although the Perl::Critic distribution already includes a number of Policies based on Damian Conway's book I (which will be referred to via "I" from here on), Perl::Critic is not limited to his guidelines and can be used to enforce any practice, preference, or style that you want to follow. You can even write Policies to enforce contradictory guidelines. All you need to do is write a corresponding L subclass, which may require as little as 10 lines of code.

=head1 BACKGROUND

The heart of Perl::Critic is L, a parser and lexer for Perl. PPI transforms Perl source code into a Document Object Model (DOM). Each token in the document is represented by a PPI class, such as L or L, and then organized into structure classes, like L and L. The root node of the hierarchy is the L.

The L engine traverses each node in the L tree and invokes each of the L subclasses at the appropriate node. The Policy can inspect the node, look at the surrounding nodes, and do whatever else it wants. If the Policy decides that a coding standard has been violated, it returns one or more L objects. If there are no violations, then the Policy returns nothing.

Policies are usually written based on existing policies, so let's look at one to see how it works. The F Policy is relatively simple and demonstrates most of the important issues. The goal of this Policy is to enforce that every call to C uses a block for the first argument and not an expression. The reasons for this Policy are discussed in detail in I.

=head1 EXAMPLE POLICY

First, the Policy module needs to have a name. Perl::Critic uses L to automatically discover all modules in the C namespace.
Also, we've adopted the convention of grouping Policies into directories according to the chapters of I. Since the goal of this Policy is to enforce the use of block arguments to C and it comes from the "Builtin Functions" chapter of I, we call it C<"Perl::Critic::Policy::BuiltinFunctions::RequireBlockGrep">.

    package Perl::Critic::Policy::BuiltinFunctions::RequireBlockGrep;

Next, we set some pragmas and load the modules that we'll need. All Policy modules inherit from the L class, which provides no-op implementations of the basic methods. Our job is to override these methods to make them do something useful.

Technically, C and C are optional, but we don't want Perl::Critic to be a hypocrite, now do we?

    use strict;
    use warnings;

    use Readonly;

    use Perl::Critic::Utils qw{ :severities :classification :ppi };
    use parent 'Perl::Critic::Policy';

    our $VERSION = '1.05';

Next, we'll declare a description and explanation for this Policy. The description is always just a string that basically says "this is what's wrong." The explanation can be either a string with further details, or a reference to an array of integers that correspond to page numbers in I. We make them read-only because they never change. (See L for why we don't C.)

    Readonly::Scalar my $DESC => q{Expression form of "grep"};
    Readonly::Scalar my $EXPL => [ 169 ];

Most policies don't need to override the C method provided by L. However, if your Policy is configurable via F<.perlcriticrc>, you should implement a C method and need to implement C to examine the C<$config> values. Since this Policy isn't configurable, we'll declare that by providing an implementation of C that returns an empty list.

    sub supported_parameters { return () }

Next, we define the C method, which must return an integer indicating the severity of violating this Policy. Severity values range from 1 to 5, where 5 is the "most severe." In general, level 5 is reserved for things that are frequently misused and/or cause bugs.
Level 1 is for things that are highly subjective or purely cosmetic. The L package exports several severity constants that you can use here via the C<:severities> tag.

    sub default_severity { return $SEVERITY_HIGH }

Likewise, the C method returns a list of theme names. Themes are intended to be named groups of Policies. All Policies that ship with Perl::Critic have a C<"core"> theme. Since use of C without blocks often leads to bugs, we include a C<"bugs"> theme. And since this Policy comes directly from I, this Policy should be a member of the C<"pbp"> theme.

    sub default_themes { return qw( core bugs pbp ) }

As a Policy author, you can assign any themes you want to the Policy. If you're publishing a suite of custom Policies, we suggest that you create a unique theme that covers all the Policies in the distribution. That way, users can easily enable or disable all of your policies at once. For example, Policies in the L distribution all have a C<"more"> theme.

Next, we indicate what elements of the code this Policy will analyze, like statements or variables or conditionals or POD. These elements are specified as PPI classes such as L, L, L or L respectively. The C method returns a list of PPI package names. (You can get that list of available package names via C.) As Perl::Critic traverses the document, it will call the C method from this module whenever it encounters one of the PPI types that are given here. In this case, we just want to test calls to C. Since the token "grep" is a L, we return that package name from the C method.

    sub applies_to { return 'PPI::Token::Word' }

If your Policy needs to analyze several different types of elements, the C method may return the names of several PPI packages. If your Policy needs to examine the file as a whole, then the C method should return L. Since there is only one PPI::Document element, your Policy would only be invoked once per file.

Now comes the interesting part. The C method does all the work.
It is always called with 2 arguments: a reference to the current PPI element that Perl::Critic is traversing, and a reference to the entire PPI document. [And since this is an object method, there will be an additional argument that is a reference to this object (C<$self>), but you already knew that!] Since this Policy does not need access to the document as a whole, we ignore the last parameter by assigning to C.

    sub violates {
        my ( $self, $elem, undef ) = @_;

The C method then often performs some tests to make sure we have the right "type" of element. In our example, we know that the element will be a L because that's what we declared back in the C method. However, we didn't specify exactly which "word" we were looking for. Evaluating a PPI element in a string context returns the literal form of the code. (You can also use the C method.) So we make sure that this C is, in fact, "grep". If it's not, then we don't need to bother examining it.

        return if $elem ne 'grep';

The C class is also used for barewords and methods called on object references. It is possible for someone to declare a bareword hash key as C<< %hash = ( grep => 'foo') >>. We don't want to test those types of elements because they don't represent function calls to C. So we use one of the handy utility functions from L to make sure that this "grep" is actually in the right context. (The C subroutine is brought in via the C<:classification> tag.)

        return if ! is_function_call($elem);

Now that we know this element is a call to the C function, we can look at the nearby elements to see what kind of arguments are being passed to it. In the following paragraphs, we discuss how to do this manually in order to explore L; after that, we'll show how this Policy actually uses facilities provided by L to get this done.

Every PPI element is linked to its siblings, parent, and children (if it has any).
Since those siblings could just be whitespace, we use the C to get the next code-sibling (the "s" in C stands for "significant").

        my $sib = $elem->snext_sibling() or return;

In Perl, the parentheses around argument lists are usually optional, and PPI packs the elements into a L object when parentheses are used. So if the sibling is a C, we pull out the first (significant) child of that list. This child will be the first argument to C. If parentheses were not used, then the sibling itself is the first argument.

        my $arg = $sib->isa('PPI::Structure::List') ? $sib->schild(0) : $sib;

In actuality, this sort of function argument lookup is common, so there is a L subroutine available via the C<:ppi> tag. So we use that instead.

        my $arg = first_arg($elem);

Finally, we now have a reference to the first argument to C. If that argument is a block (i.e. something in curly braces), then it will be a L, in which case our Policy is satisfied and we just return nothing.

        return if !$arg;
        return if $arg->isa('PPI::Structure::Block');

But if it is not a L, then we know that this call to C must be using the expression form, and that violates our Policy. So we create and return a new L object via the L method, passing in the description, explanation, and a reference to the PPI element that caused the violation. And that's all there is to it!

        return $self->violation( $DESC, $EXPL, $elem );
    }

    1;

One last thing -- people are going to need to understand what is wrong with the code when your Policy finds a problem. It isn't reasonable to include all the details in your violation description or explanation. So please include a DESCRIPTION section in the POD for your Policy. It should succinctly describe the behavior and motivation for your Policy and include a few examples of both good and bad code. Here's an example:

    =pod

    =head1 NAME

    Perl::Critic::Policy::BuiltinFunctions::RequireBlockGrep

    =head1 DESCRIPTION

    The expression forms of C and C are awkward and hard to read.
    Use the block forms instead.

        @matches = grep /pattern/, @list;        #not ok
        @matches = grep { /pattern/ } @list;     #ok

        @mapped = map transform($_), @list;      #not ok
        @mapped = map { transform($_) } @list;   #ok

    =cut

When your policy has a section like this, users can invoke L with a C<--verbose> parameter of C<10> or C<11> or with a "%d" escape to see it along with the rest of the output for violations of your policy.

=head1 MAKING YOUR POLICY CONFIGURABLE

L takes care of gathering configuration information for your Policy, from whatever source the user specifies. (See L for the details of how a user specifies the values you're going to receive.) What your Policy ends up receiving for the value of a parameter is a string with leading and trailing whitespace removed. By default, you will need to handle conversion of that string to a useful form yourself. However, if you provide some metadata about your parameters, the parameter handling will be taken care of for you. (Additionally, tools that deal with Policies themselves can use this information to enhance their functionality. See the L C<--profile-proto> option for an example.)

You can look at L for a simple example of a configurable Policy and L for a more complex one.

=head2 Do It All Yourself

The C method for a Policy receives one argument: an instance of L. This method is only called if the user's configuration has enabled the policy. It returns a boolean stating whether the Policy should continue to be enabled. Generally, the only reason to return C<$FALSE> is when some external requirement is missing. For example, L used to disable itself if L was not installed (that is, until we made it no longer optional for the Perl-Critic distribution).

A basic, do-nothing implementation of C would be:

    use Perl::Critic::Utils qw< :booleans >;

    ...

    sub initialize_if_enabled {
        my ( $self, $config ) = @_;

        return $TRUE;
    }

As stated above, what you get in C<$config> are trimmed strings.
For example, if the user's F<.perlcriticrc> contains

    [Your::Policy]
    foo = bar baz
    factor = 5.52
    selections = 2 78 92

then C<$config> will contain the equivalent of

    my $config = {
        foo        => 'bar baz',
        factor     => '5.52',
        selections => '2 78 92',
    };

To make this available to the C method, the values are usually put into C<$self> under the name of the configuration item prefixed with an underscore. E.g.

    sub initialize_if_enabled {
        my ( $self, $config ) = @_;

        $self->{_foo}        = $config->get('foo');
        $self->{_factor}     = $config->get('factor');
        $self->{_selections} = $config->get('selections');

        return $TRUE;
    }

Often, you'll want to convert the configuration values into something more useful. In this example, C is supposed to be a list of integers. L contains a number of functions that can help you with this. Assuming that C wants to have C as an array, you'll want to have something like this:

    use Perl::Critic::Utils qw{ :booleans :characters :data_conversion };

    sub initialize_if_enabled {
        my ( $self, $config ) = @_;

        $self->{_foo}    = $config->get('foo');
        $self->{_factor} = $config->get('factor');

        # Fall back to an empty string when the user gave no value.
        my $selections = $config->get('selections') // $EMPTY;
        $self->{_selections} = [ words_from_string($selections) ];

        return $TRUE;
    }

Since C contains numbers, it may be desirable to change the assignment to look like

    $self->{_selections} = [ map { $_ + 0 } words_from_string($selections) ];

If C needs to quickly determine whether a particular value is in C, you would want to use a hash instead of an array, like this:

    $self->{_selections} = { hashify( words_from_string($selections) ) };

For an example of a Policy that has some simple, but non-standard configuration handling, see L.

=head2 Note On Constructors

It used to be the case that Policies handled configuration by implementing a constructor. However, there was no requirement to call the base constructor; as long as the Policy ended up being a blessed hash reference, everything was fine.
Unfortunately, this meant that Policies would be loaded and their prerequisites would be Cd, even if the Policy wasn't enabled, slowing things down. Also, this severely restricted the core of L's ability to enhance things. Use of constructors is deprecated and is incompatible with C metadata below. Kindly use C, instead, to do any sort of set up that you need.

=head2 Providing Basic Configuration Information Via C

As a minimum for a well-behaved Policy, you should implement C in order to tell the rest of C what configuration values the Policy looks for, even if it is only to say that the Policy is not configurable. In the simple form, this function returns a list of the names of the parameters the Policy supports. So, for a non-configurable Policy, as in the C example above, this looked like

    sub supported_parameters { return () }

For the example being used in the C section above, this would be

    sub supported_parameters {
        return qw< foo factor selections >;
    }

Given this information, C can tell the user when they have specified a parameter for a Policy which isn't valid, e.g. when they've misspelled the name of the parameter, and can emit the parameter as part of a F<.perlcriticrc> prototype.

You can provide even more information about your Policy's configuration by giving each parameter a description and a string representation of the default value for the parameter. You do this by having the values in the list returned by C be hash references instead of strings, with keys of C, C, and C. For example,

    sub supported_parameters {
        return (
            {
                name           => 'allowed_values',
                description    =>
                    'Individual and ranges of values to allow, and/or "all_integers".',
                default_string => '0 1 2',
            },
            {
                name           => 'allowed_types',
                description    => 'Kind of literals to allow.',
                default_string => 'Float',
            },
        );
    }

Note that use of constructors is incompatible with specifying parameters in this way.
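
As a rough illustration of how a tool might consume the parameter metadata described above, the following sketch renders a commented-out profile prototype from the list a policy returns. The package C<My::Example::Policy> and the helper C<prototype_lines()> are hypothetical names invented for this sketch, and the output layout is an assumption; the prototype generation in Perl::Critic itself may well differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Policy class; only its metadata matters here.
package My::Example::Policy;

sub supported_parameters {
    return (
        {
            name           => 'allowed_values',
            description    =>
                'Individual and ranges of values to allow, and/or "all_integers".',
            default_string => '0 1 2',
        },
        {
            name           => 'allowed_types',
            description    => 'Kind of literals to allow.',
            default_string => 'Float',
        },
    );
}

package main;

# Hypothetical helper: turn each declared parameter into prototype lines.
sub prototype_lines {
    my ($policy_class) = @_;

    my @lines;
    for my $parameter ( $policy_class->supported_parameters() ) {
        # The simple form is a plain name; normalize it to a hash reference.
        my $spec = ref $parameter ? $parameter : { name => $parameter };

        push @lines, "# $spec->{description}"
            if defined $spec->{description};

        my $default =
            defined $spec->{default_string} ? $spec->{default_string} : q{};
        push @lines, "# $spec->{name} = $default";
    }

    return @lines;
}

print "$_\n" for prototype_lines('My::Example::Policy');
```

The helper never instantiates the policy; it only reads the declared names, descriptions, and default strings, which is the point of declaring them even when the Policy parses its values by hand.
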
=head2 Using C to Get It Done For You

The C discussion above showed how you could help others with your Policy, but didn't do anything to make your life as a Policy author easier; you still need to implement C to access any configuration that the user has specified. To have the configuration automatically handled for you, you need to declare how your parameters act by specifying a value for their C. For example, the following declares that a parameter allows the user to choose from five specific values and that the user can select any combination of them:

    sub supported_parameters {
        return (
            {
                name               => 'allowed_types',
                description        => 'Kind of literals to allow.',
                default_string     => 'Float',
                behavior           => 'enumeration',
                enumeration_values => [ qw{ Binary Exp Float Hex Octal } ],
                enumeration_allow_multiple_values => 1,
            },
        );
    }

When you specify a behavior, parsing and validation of the user-specified and default values is done for you and your C method can retrieve the value under the key of the parameter name prefixed with an underscore, e.g., for the above declaration, the parsed and validated value can be accessed via C<< $self->{_allowed_types} >>.

The behaviors provide additional functionality to C; for more on this, see L and L.

The following discusses each of the supported behaviors and the options they support. For the full details of a behavior, see the documentation for the implementing class.

=head3 "string"

Implemented in L. The most basic of behaviors, the value of the parameter will be stored in the Policy as a string.

This behavior is not configurable.

=head4 C example

    sub supported_parameters {
        return (
            {
                name           => 'a_string',
                description    => 'An example string.',
                default_string => 'blah blah blah',
                behavior       => 'string',
            },
        );
    }

=head4 Access example

    sub violates {
        my ($self, $element, $document) = @_;

        ...
        my $string = $self->{_a_string};
        ...
    }

=head3 "boolean"

Implemented in L.
The value of the parameter will be either L<$TRUE|Perl::Critic::Utils/$TRUE> or L<$FALSE|Perl::Critic::Utils/$FALSE>.

This behavior is not configurable.

=head4 C example

    sub supported_parameters {
        return (
            {
                name           => 'a_boolean',
                description    => 'An example boolean.',
                default_string => '1',
                behavior       => 'boolean',
            },
        );
    }

=head4 Access example

    sub violates {
        my ($self, $element, $document) = @_;

        ...
        my $is_whatever = $self->{_a_boolean};
        if ($is_whatever) {
            ...
        }
        ...
    }

=head3 "integer"

Implemented in L. The value is validated against C (with a special check for "0"). Notice that this means that underscores are allowed in input values as with Perl numeric literals.

This takes two options, C and C, which specify endpoints of an inclusive range to restrict the value to. Either, neither, or both may be specified.

=head4 C example

    sub supported_parameters {
        return (
            {
                name            => 'an_integer',
                description     => 'An example integer.',
                default_string  => '5',
                behavior        => 'integer',
                integer_minimum => 0,
                integer_maximum => 10,
            },
        );
    }

=head4 Access example

    sub violates {
        my ($self, $element, $document) = @_;

        ...
        my $integer = $self->{_an_integer};
        if ($integer > $TURNING_POINT) {
            ...
        }
        ...
    }

=head3 "string list"

Implemented in L. The values will be derived by splitting the input string on blanks. (See L.) The parameter will be stored as a reference to a hash, with the values being the keys.

This takes one optional option, C, of a reference to an array of strings that will always be included in the parameter value, e.g. if the value of this option is C<[ qw{ a b c } ]> and the user specifies a value of C<'c d e'>, then the value of the parameter will contain C<'a'>, C<'b'>, C<'c'>, C<'d'>, and C<'e'>.
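
To make the resulting shape concrete, here is a minimal sketch in plain Perl of the merge described above. The helper C<string_list_value()> is a hypothetical name used only for illustration; it mimics the documented outcome (a hash reference keyed by the values) and is not the code of the implementing behavior class.

```perl
use strict;
use warnings;

# Hypothetical helper: combine the always-present values with the
# blank-separated values from the user's configuration string.
sub string_list_value {
    my ( $user_string, $always_present ) = @_;

    my %value =
        map { $_ => 1 }
        ( @{$always_present}, split( q{ }, $user_string ) );

    return \%value;
}

my $value = string_list_value( 'c d e', [ qw{ a b c } ] );

# Duplicates collapse because the values are stored as hash keys.
print join( q{ }, sort keys %{$value} ), "\n";    # prints "a b c d e"
```

Storing the values as hash keys is what makes a membership test such as C<< $value->{d} >> cheap inside a method that runs once per element.
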
=head4 C example

    sub supported_parameters {
        return (
            {
                name           => 'a_string_list',
                description    => 'An example list.',
                default_string => 'red pink blue',
                behavior       => 'string list',
                list_always_present_values => [ qw{ green purple } ],
            },
        );
    }

=head4 Access example

    sub violates {
        my ($self, $element, $document) = @_;

        ...
        my $list = $self->{_a_string_list};
        my @list = keys %{$list};
        ...
        return if not $list->{ $element->content() };
        ...
    }

=head3 "enumeration"

Implemented in L. The values will be derived by splitting the input string on blanks. (See L.) Depending upon the value of the C option, the parameter will be stored as a string or a reference to a hash, with the values being the keys.

This behavior takes one required option and one optional one. A value for C of a reference to an array of valid strings is required. A true value can be specified for C to allow the user to pick more than one value, but this defaults to false.

=head4 C example

    use Perl::Critic::Utils qw{ :characters };

    sub supported_parameters {
        return (
            {
                name               => 'a_single_valued_enumeration',
                description        =>
                    'An example enumeration that can only have a single value.',
                default_string     => $EMPTY,
                behavior           => 'enumeration',
                enumeration_values => [ qw{ block statement pod operator } ],
                enumeration_allow_multiple_values => 0,
            },
            {
                name               => 'a_multi_valued_enumeration',
                description        =>
                    'An example enumeration that can have multiple values.',
                default_string     => 'fe',
                behavior           => 'enumeration',
                enumeration_values => [ qw{ fe fi fo fum } ],
                enumeration_allow_multiple_values => 1,
            },
        );
    }

=head4 Access example

    sub violates {
        my ($self, $element, $document) = @_;

        ...
        my $single_value = $self->{_a_single_valued_enumeration};
        ...
        my $multi_value = $self->{_a_multi_valued_enumeration};
        if ( $multi_value->{fum} ) {
            ...
        }
        ...
    }

=head2 Using a Custom Parser

If none of the behaviors does exactly what you want it to, you can provide your own parser for a parameter.
The reason for doing this as opposed to using an implementation of C is that it allows you to use a behavior to provide its extra functionality and it provides a means for a C configuration program, e.g. an IDE that integrates C, to validate your parameter as the user modifies its value.

The way you declare that you have a custom parser is to include a reference to it in the parameter specification with the C key. For example:

    sub supported_parameters {
        return (
            {
                name           => 'file_name',
                description    => 'A file to read a list of values from.',
                default_string => undef,
                behavior       => 'string',
                parser         => \&_parse_file_name,
            },
        );
    }

A parser is a method on a subclass of L that takes two parameters: the L that is being specified and the value string provided by the user. The method is responsible for dealing with any default value and for saving the parsed value for later use by the C method.

An example parser (without enough error handling) for the above example declaration:

    use Path::Tiny;

    use Perl::Critic::Exception::Configuration::Option::Policy::ParameterValue
        qw{ throw_policy_value };

    sub _parse_file_name {
        my ($self, $parameter, $config_string) = @_;

        my @thingies;

        if ($config_string) {
            if (not -r $config_string) {
                throw_policy_value
                    policy         => $self->get_short_name(),
                    option_name    => $parameter->get_name(),
                    option_value   => $config_string,
                    message_suffix => 'is not readable.';
            }

            # One value per line, with trailing newlines removed.
            @thingies = path($config_string)->lines( { chomp => 1 } );
        }

        $self->{_thingies} = \@thingies;

        return;
    }

Note that, if the value for the parameter is not valid, an instance of L is thrown. This allows C to include that problem along with any other problems found with the user's configuration in a single error message.

=head2 Using Both C and C

There are cases where a Policy needs additional initialization beyond configuration or where the way it acts depends upon the combination of multiple parameters. In such situations, you will need to create an implementation of C.
If you want to take advantage of the supplied parameter handling from within your implementation of C, note that the information from C will already have been used, with user-supplied parameter values validated and placed into the Policy by the time C has been called. It is likely that you will not need to refer to the contents of the C<$config> parameter; just pull the information you need out of C<$self>. In fact, the values for those parameters will no longer be present in C<$config>.

=head2 Summary of permitted hash keys in C.

=head3 All types

=over

=item - "name" (mandatory)

=item - "description" (optional)

=item - "behavior" (optional)

Currently, one of:

=over

=item "boolean"

=item "enumeration"

=item "integer"

=item "string"

=item "string list"

=back

=item - "default_string" (optional)

A string representation of the default value of the parameter.

=item - "parser" (optional)

A code ref to a custom parser for the parameter.

=back

=head3 Enumerations

=over

=item - "enumeration_values" (mandatory)

A mandatory reference to an array of strings.

=item - "enumeration_allow_multiple_values" (optional)

Boolean indicating whether the user may select more than one value (false, the default, restricts the user to a single value).

=back

=head3 Integers

=over

=item - "integer_minimum" (optional)

Minimum allowed value, inclusive.

=item - "integer_maximum" (optional)

Maximum allowed value, inclusive.

=back

=head3 String lists

=over

=item - "list_always_present_values" (optional)

A reference to an array of values that should always be included in the value of the parameter.

=back

=head1 ADDITIONAL FEATURES

=head2 C

Certain problems that a Policy detects can be endemic to a particular file; if there's one violation, there's likely to be many. A good example of this is L; if there's one line before C, there's a good chance that the entire file is missing C. In such cases, it's not much help to the user to report every single violation. If you've got such a policy, you should override the L method to provide a limit.
The user can override this value with a value for "maximum_violations_per_document" in their F<.perlcriticrc>.

See the source code for L and L for examples.

=head2 C

Most L Policies are purely I. In other words, they never compile or execute any of the source code that they analyze. However, it is possible to write I Policies that do compile or execute code, which may result in unsafe operations (see L for an example). So the C method is used to indicate whether a Policy can be trusted to not cause mischief. By default, C returns true. But if you are writing a Policy that will compile or execute any of the source code that it analyzes, then you should override the C method to return false.

=head1 DISTRIBUTING YOUR POLICIES

=head2 Create a Distribution

You need to come up with a name for your set of policies. Sets of add-on policies are generally named C>, e.g. L.

The module representing the distribution will not actually have any functionality; it's just documentation and a name for users to use when installing via L/L. The important part is that this will include a list of the included policies, with descriptions of each.

A typical implementation will look like:

    package Perl::Critic::Example;

    use strict;
    use warnings;

    our $VERSION = '1.000000';

    1; # Magic true value required at end of module

    __END__

    =head1 NAME

    Perl::Critic::Example - Policies for Perl::Critic that act as an example.

    =head1 AFFILIATION

    This module has no functionality, but instead contains documentation
    for this distribution and acts as a means of pulling other modules
    into a bundle.  All of the Policy modules contained herein will have
    an "AFFILIATION" section announcing their participation in this
    grouping.

    =head1 SYNOPSIS

    Some L policies that will help you keep your code nice and compliant.

    =head1 DESCRIPTION

    The included policies are:

    =over

    =item L

    Complains about some example documentation issues.  [Default severity: 3]

    =item L

    All modules must have at least one variable.
    [Default severity: 3]

    =back

    =head1 CONFIGURATION AND ENVIRONMENT

    All policies included are in the "example" theme.  See the L
    documentation for how to make use of this.

=head2 Themes

Users can choose which policies to enable using themes. You should implement C so that users can take advantage of this. In particular, you should use a theme named after your distribution in all your policies; this should match the value listed in the C POD section as shown above.

    sub default_themes { return qw< example math > }

If you're looking for ideas of what themes to use, have a look at the output of C.

=head2 Documentation

=head3 AFFILIATION

Since all policies have to go somewhere under the C namespace, it isn't always clear what distribution a policy came from when browsing through their documentation. For this reason, you should include an C section in the POD for all of your policies that states where the policy comes from. For example:

    =head1 AFFILIATION

    This policy is part of L.

=head3 CONFIGURATION

In order to make it clear what can be done with a policy, you should always include a C section in your POD, even if it's only to say:

    =head1 CONFIGURATION

    This Policy is not configurable except for the standard options.

=head1 TESTING YOUR POLICY

The L distribution also contains a framework for testing your Policy. See L for the details.

=head1 HINT

When you're trying to figure out what L is going to hand you for a chunk of code, there is a F program in the L distribution that will help you. For example, when developing the above RequireBlockGrep example, you might want to try

    tools/ppidump '@matches = grep /pattern/, @list;'

and

    tools/ppidump '@matches = grep { /pattern/ } @list;'

to see the differences between the two cases.

Alternatively, see the C documentation at L and the C documentation at L.

=head1 VERSION

This is part of L version 1.116.

=head1 SEE ALSO

Chas. Owens has a blog post about developing in-house policies at L.
=head1 AUTHOR

Jeffrey Ryan Thalhammer

=head1 COPYRIGHT

Copyright (c) 2005-2011 Imaginative Software Systems. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. The full text of this license can be found in the LICENSE file included with this module.

=cut

##############################################################################
# Local Variables:
#   mode: cperl
#   cperl-indent-level: 4
#   fill-column: 78
#   indent-tabs-mode: nil
#   c-indentation-style: bsd
# End:
# ex: set ts=8 sts=4 sw=4 tw=70 ft=pod expandtab shiftround :