=encoding utf8

=head1 Title

Synopsis 29: Builtin Functions

=head1 Version

    Author:        Rod Adams
    Maintainer:    Larry Wall
    Contributions: Aaron Sherman
                   Mark Stosberg
    Date:          12 Mar 2005
    Last Modified: 31 Jan 2008
    Version:       20

This document attempts to document the list of builtin functions in Perl 6. It assumes familiarity with Perl 5 and prior synopses.

The document is now the official S29. It's still here in the pugs repository temporarily to allow easy access to pugs implementors, but eventually it will be copied over to svn.perl.org. Despite its being "official", feel free to hack on it as long as it's in the pugs space. -law

This document is generated from the pod in the pugs repository under /docs/Perl6/Spec/Functions.pod so edit it there in the SVN repository if you would like to make changes.

=head1 Notes

In Perl 6, all builtin functions belong to a named package (generally a class or role). Not all functions are guaranteed to be imported into the global package C<::*>. In addition, the list of functions imported into C<::*> will be subject to change with each release of Perl. Authors wishing to "Future Proof" their code should either specifically import the functions they will be using, or always refer to the functions by their full name.

After 6.0.0 comes out, global aliases will not be removed lightly, and will never be removed at all without having gone through a deprecation cycle of at least a year. In any event, you can specify that you want the interface for a particular version of Perl, and that can be emulated by later versions of Perl to the extent that security updates allow.

Where code is given here, it is intended to define semantics, not to dictate implementation.

=head2 Operators vs.
Functions

There is no particular difference between an operator and a function, but for the sake of documentation, only functions declared without specifying a grammatical category or with a category of C (see L) will be described as "functions", and everything else as "operators", which are outside the scope of this document.

=head2 Multis vs. Functions

In actual fact, most of the "functions" defined here are multi subs, or are multi methods that are also exported as multi subs. Multi subs are all visible in the global namespace (unless declared with a "my multi"). The assumption is that with sufficiently specific typing on the multis, the user is free to extend a particular name to new types.

=head1 Type Declarations

The following type declarations are assumed:

=over

=item AnyChar

The root class of all "character" types, regardless of level.

This is a subtype of C, limited to a length of 1 at its highest supported Unicode level.

The type name C is aliased to the maximum supported Unicode level in the current lexical scope (where "current" is taken to mean the eventual lexical scope for generic code (roles and macros), not the scope in which the generic code is defined). In other words, use C when you don't care which level you're writing for.

Subclasses (things that are C):

=over

=item CharLingua (language-defined characters)

=item Grapheme (language-independent graphemes)

=item Codepoint

=item Byte

Yes, Byte is both a string and a number.

=back

The short name for C is typically C since that's the default Unicode level. A grapheme is defined as a base codepoint plus any subsequent "combining" codepoints that apply to that base codepoint. Graphemes are always assigned a unique integer id which, in the case of a grapheme that has a precomposed codepoint, happens to be the same as that codepoint.

There is no short name for C because the type is meaningless outside the scope of a particular language declaration.
In fact, C is itself an abstract type that cannot be instantiated. Instead you have names like C, C, C, etc. for instantiated C types. (Plus the corresponding C types, presumably.)

=item Matcher

    subset Matcher of Item | Junction;

Used to supply a test to match against. Assume C<~~> will be used against it.

=item Ordering

    subset KeyExtractor of Code where { .sig === :(Any --> Any) };
    subset Comparator   of Code where { .sig === :(Any, Any --> Int ) };
    subset OrderingPair of Pair where { .left ~~ KeyExtractor && .right ~~ Comparator };

    subset Ordering where Signature | KeyExtractor | Comparator | OrderingPair;

Used to handle comparisons between things. Generally this ends up in functions like C, C, C, C, C, etc., as a $by parameter which provides the information on how two things compare relative to each other.

Note that C and C do almost, but not quite, the same thing, since with C you don't care whether two things are ordered increasing or decreasing, only whether they are the same. Rather than declare an C type declaration, C will just do double duty.

=over

=item Comparator

A closure with arity of 2, which for ordering returns negative/zero/positive, signaling the first argument should be before/tied with/after the second. aka "The Perl 5 way".

For equivalence the closure returns either non-zero or 0, indicating whether or not the first argument is equivalent to the second.

=item KeyExtractor

A closure with arity of 1, which returns the "key" by which to compare. Values are compared using C for orderings and C for equivalences, which in Perl 6 do different comparisons depending on the types. (To get a Perl 5 string ordering you must compare with C instead.)

Internally the result of the KeyExtractor on a value should be cached.

=item OrderingPair

A combination of the two methods above, for when one wishes to take advantage of the internal caching of keys that is expected to happen, but wishes to compare them with something other than C or C, such as C<< <=> >> or C.
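As a rough illustration of how the three closure-based criteria differ, here is a sketch in Python. Python is used only to model the semantics; this is not Perl 6 code and not part of the specification. The helpers C<sort_by_key>, C<sort_by_comparator>, and C<sort_by_pair> are hypothetical names, and the up-front list of keys stands in for the internal key caching that an implementation is expected to do.

```python
from functools import cmp_to_key


def sort_by_key(values, key_extractor):
    """KeyExtractor: an arity-1 closure; each key is computed once and cached."""
    keyed = [(key_extractor(v), v) for v in values]  # cache the keys up front
    keyed.sort(key=lambda pair: pair[0])             # default ordering on the cached key
    return [v for _, v in keyed]


def sort_by_comparator(values, comparator):
    """Comparator: an arity-2 closure returning negative/zero/positive."""
    return sorted(values, key=cmp_to_key(comparator))


def sort_by_pair(values, key_extractor, comparator):
    """OrderingPair: cached keys, compared with a caller-supplied comparator."""
    keyed = [(key_extractor(v), v) for v in values]
    keyed.sort(key=cmp_to_key(lambda a, b: comparator(a[0], b[0])))
    return [v for _, v in keyed]


words = ["pear", "Fig", "apple", "kiwi"]

# Key only: order by length, ties keep their original relative order.
by_length = sort_by_key(words, len)

# Comparator only: descending codepoint order.
by_reverse_alpha = sort_by_comparator(words, lambda a, b: (a < b) - (a > b))

# Pair: extract a case-folded key once, then compare keys ascending.
by_folded = sort_by_pair(words, str.lower, lambda a, b: (a > b) - (a < b))

print(by_length)
print(by_reverse_alpha)
print(by_folded)
```

The pair form matters when key extraction is expensive: the extractor runs once per value, while the comparator may run many times per value.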
=item Signature

If a signature is specified as a criterion, the signature is bound to each value and then each parameter does comparisons in positional order according to its type, as modified by its traits. Basically, the system will write the body of the key extraction and comparison subroutine for you based on the signature.

For ordering the list of positional parameter comparisons is reduced as if using [||], but not all comparisons need to be performed if an early one determines an increasing or decreasing order. For equivalence the list is reduced as if using [&&].

=back

=back

=head1 Function Packages

=head2 Any

The following are defined in the C role:

=over

=item eqv

    our Bool multi sub eqv (Ordering @by, $a, $b)
    our Bool multi sub eqv (Ordering $by = &infix:, $a, $b)

Returns a Bool indicating if the parameters are equivalent, using criteria C<$by> or C<@by> for comparisons. C<@by> differs from C<$by> in that each criterion is applied, in order, until a non-zero (equivalent) result is achieved.

=item cmp

    our Order multi sub cmp (Ordering @by, $a, $b)
    our Order multi sub cmp (Ordering $by = &infix:, $a, $b)

Returns C, or C, or C (which numify to -1, 0, +1 respectively) indicating if parameter C<$a> should be ordered before/tied with/after parameter C<$b>, using criteria C<$by> or C<@by> for comparisons. C<@by> differs from C<$by> in that each criterion is applied, in order, until a non-zero (non-tie) result is achieved.

If the values are not comparable, returns a proto C object that is undefined.

=back

=head2 Num

The following are all defined in the C role:

B: L

C provides a number of constants in addition to the basic mathematical functions. To get these constants, you must request them:

    use Num :constants;

or use the full name, e.g. C.

=over

=item abs

    our Num multi method abs ( Num $x: ) is export

Absolute Value.

=item floor

    our Int multi method floor ( Num $x: ) is export

Returns the highest integer not greater than C<$x>.
=item ceiling

    our Int multi method ceiling ( Num $x: ) is export

Returns the lowest integer not less than C<$x>.

=item round

    our Int multi method round ( Num $x: ) is export

Returns the nearest integer to C<$x>. The algorithm is C. (Other rounding algorithms will be given extended names beginning with "round".)

=item truncate

    our Int multi method truncate ( Num $x: ) is export
    our Int multi method int ( Num $x: ) is export

Returns the closest integer to C<$x> whose absolute value is not greater than the absolute value of C<$x>. (In other words, just chuck any fractional part.) This is the default rounding function used by an C cast, for historic reasons. But see Int constructor above for a rounded version.

=item exp

    our Num multi method exp ( Num $exponent: Num :$base = Num::e ) is export

Performs similarly to C<$base ** $exponent>. C<$base> defaults to the constant I<e>.

=item log

    our Num multi method log ( Num $x: Num :$base = Num::e ) is export

Logarithm of base C<$base>, default Natural. Calling with C<$x == 0> is an error.

=item log10

    our Num multi method log10 (Num $x:) is export

A base C<10> logarithm, otherwise identical to C.

=item rand

    our Num method rand ( Num $x: )
    our Num term:

Pseudo random number in range C<< 0 ..^ $x >>. That is, C<0> is theoretically possible, while C<$x> is not. The C function is 0-ary and always produces a number from C<0..^1>. In any case, for picking a random integer you probably want to use something like C<(1..6).pick> instead.

=item sign

    our Int multi method sign ( Num $x: ) is export

Returns 1 when C<$x> is greater than 0, -1 when it is less than 0, 0 when it is equal to 0, or undefined when the value passed is undefined.

=item srand

    multi method srand ( Num $seed: )
    multi srand ( Num $seed = default_seed_algorithm())

Seed the generator C uses. C<$seed> defaults to some combination of various platform dependent characteristics to yield a non-deterministic seed.
Note that you get one C<srand> for free when you start a Perl program, so you I<must> call C<srand> yourself if you wish to specify a deterministic seed (or if you wish to be differently nondeterministic).

=item sqrt

    our Num multi method sqrt ( Num $x: ) is export

Returns the square root of the parameter.

=item roots

    (in Num) method roots (Num $x: Int $n --> List of Num) is export

Returns a list of all C<$n>th (complex) roots of C<$x>.

=item cis

    our Complex multi method cis (Num $angle:) is export

Returns 1.unpolar($angle)

=item unpolar

    our Complex multi method unpolar (Num $mag: Num $angle) is export

Returns a complex number specified in polar coordinates. Angle is in radians.

=back

=head2 Complex

    our Seq multi method polar (Complex $nim: ) is export

Returns (magnitude, angle) corresponding to the complex number. The magnitude is non-negative, and the angle is in the range C<-π ..^ π>.

=head2 The :Trig tag

The following are also defined in C but not exported without a C<:Trig> tag. (Which installs their names into C, as it happens.)

=over 4

=item I

    Num multi method func ( Num $x: $base = 'radians' ) is export(:Trig)

where I<func> is one of: sin, cos, tan, asin, acos, atan, sec, cosec, cotan, asec, acosec, acotan, sinh, cosh, tanh, asinh, acosh, atanh, sech, cosech, cotanh, asech, acosech, acotanh.

Performs the various trigonometric functions.

Option C<$base> is used to declare how you measure your angles. It is given as the value of an arc representing a single full revolution.

    $base       Result
    ----        -------
    /:i ^r/     Radians  (2*pi)
    /:i ^d/     Degrees  (360)
    /:i ^g/     Gradians (400)
    Num         Units of 1 revolution.

Note that module currying can be used within a lexical scope to specify a consistent base so you don't have to supply it with every call:

    my module Trig ::= Num::Trig.assuming(:base);

This overrides the default of "radians".
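To make the role of C<$base> concrete, the following Python sketch models how an angle given in an arbitrary base might be converted before the sine is taken. It only illustrates the table above and is not Perl 6 code: C<revolution_units> and C<sin_in> are hypothetical names, and only the forward direction (angle in, ratio out) is shown.

```python
import math


def revolution_units(base="radians"):
    """Return the size of one full revolution in the given base.

    Strings are matched case-insensitively on their first letter, mirroring
    the /:i ^r/, /:i ^d/ and /:i ^g/ patterns; a number is taken directly
    as the units of one revolution.
    """
    if isinstance(base, str):
        units = {"r": 2 * math.pi, "d": 360.0, "g": 400.0}
        prefix = base[:1].lower()
        if prefix not in units:
            raise ValueError(f"unknown angle base: {base!r}")
        return units[prefix]
    return float(base)


def sin_in(x, base="radians"):
    """Sine of an angle x expressed in the given base."""
    radians = x * 2 * math.pi / revolution_units(base)
    return math.sin(radians)


# A quarter revolution has sine 1 regardless of how it is expressed.
print(sin_in(math.pi / 2))      # radians (the default)
print(sin_in(90, "degrees"))
print(sin_in(100, "Gradians"))
print(sin_in(0.25, 1))          # base 1: angles measured in whole revolutions
```

Under this model an inverse function such as asin would apply the conversion in the opposite direction to its result; that is an assumption of the sketch rather than something stated above.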
=item atan2

    our Num multi method atan2 ( Num $y: Num $x = 1 )
    our Num multi atan2 ( Num $y, Num $x = 1 )

This second form of C computes the arctangent of C<$y/$x>, and takes the quadrant into account. Otherwise behaves as other trigonometric functions.

=back

=head2 Scalar

B: L

C provides the basic tools for operating on undifferentiated scalar variables. All of the following are exported by default.

=over

=item defined

    our Bool multi defined ( Any $thing )
    our Bool multi defined ( Any $thing, ::role )

C returns true if the parameter has a value and that value is not the undefined value (per C), otherwise false is returned.

Same as Perl 5, except that it takes an extra optional argument to ask whether the value is defined with respect to a particular role:

    defined($x, SomeRole);

A value may be defined according to one role and undefined according to another. Without the extra argument, defaults to the definition of defined supplied by the type of the object.

=item undefine

    our multi undefine( Any $thing )

Takes any variable as a parameter and attempts to "remove" its definition. For simple scalar variables this means assigning the undefined value to the variable. For objects, this is equivalent to invoking their undefine method. For arrays, hashes and other complex data, this might require emptying the structures associated with the object.

In all cases, calling C on a variable should place the object in the same state as if it was just declared.

=item undef

    constant Scalar Scalar::undef

Returns the undefined scalar object. C has no value at all, but for historical compatibility, it will numify to C<0> and stringify to the empty string, potentially generating a warning in doing so. There are two ways to determine if a value is equal to undef: the C function (or method) can be called or the C (or C) operator can be used.

C is also considered to be false in a boolean context. Such a conversion does not generate a warning.
Perl 5's unary C function is renamed C to avoid confusion with the value C (which is always 0-ary now).

=back

=head2 Container

=over

=item cat

    our Cat multi cat( *@@list )

C reads arrays serially rather than in parallel as C does. It returns all of the elements of the containers that were passed to it like so:

    cat(@a;@b;@c);

Typically, you could just write C<(@a,@b,@c)>, but sometimes it's nice to be explicit about that:

    @foo := [[1,2,3],[4,5,6]];
    say cat([;] @foo); # 1,2,3,4,5,6

In addition, a C in item context emulates the C interface lazily.

=item roundrobin

    our List multi roundrobin( *@@list )

C is very similar to C. The difference is that C will not stop on lists that run out of elements but simply skip any undefined value:

    my @a = 1;
    my @b = 1..2;
    my @c = 1..3;
    for roundrobin( @a; @b; @c ) -> $x { ... }

will get the following values for C<$x>: C<1, 1, 1, 2, 2, 3>

=item zip

    our List of Capture multi zip ( *@@list )
    our List of Capture multi infix: ( *@@list )

zip takes any number of arrays and returns one tuple for every index. This is easier to read in an example:

    for zip(@a;@b;@c) -> $nth_a, $nth_b, $nth_c {
        ...
    }

Mnemonic: the input arrays are "zipped" up like a zipper.

The C function defaults to stopping as soon as any of its lists is exhausted. This behavior may be modified by conceptually extending any short list using C<*>, which replicates the final element.

If all lists are potentially infinite, an evaluation in C context will automatically fail as soon as it can be known that all sublists are in the control of iterators of infinite extent, such as indefinite ranges or arbitrary replication. If it can be known at compile time, a compile-time error results.

C<Z> is an infix equivalent for zip:

    for @a Z @b Z @c -> $a, $b, $c {...}

In C<@@> context a List of Array is returned instead of a flat list.

=back

=head2 Array

All these methods are defined in the C role/class.
=over

=item shape

    our Capture method shape (@array: ) is export

Returns the declared shape of the array, as described in S09.

=item end

    our Any method end (@array: ) is export

Returns the final subscript of the first dimension; for a one-dimensional array this is simply the index of the final element. For fixed dimensions this is the declared maximum subscript. For non-fixed dimensions (undeclared or explicitly declared with C<*>), the actual last element is used.

=item elems

    our Int method elems (@array: ) is export

Returns the length of the array counted in elements. (Sparse array types should return the actual number of elements, not the distance between the maximum and minimum elements.)

=item delete

    our List multi method delete (@array : *@indices ) is export

Sets elements specified by C<@indices> in the invocant to a non-existent state, as if they never had a value. Deleted elements at the end of an Array shorten the length of the Array, unless doing so would violate an C definition.

C<@indices> is interpreted the same way as subscripting is in terms of slices and multidimensionality. See Synopsis 9 for details.

Returns the value(s) previously held in deleted locations.

A unary form is expected. See C.

=item exists

    our Bool multi method exists (@array : Int *@indices ) is export

True if the specified Array element has been assigned to. This is not the same as being defined.

Supplying a different number of indices than invocant has dimensions is an error.

A unary form is expected. See C.

=item pop

    our Scalar multi method pop ( @array: ) is export

Remove the last element of C<@array> and return it.

=item push

    our Int multi method push ( @array: *@values ) is export

Add all of the subsequent arguments to the end of C<@array>.

=item shift

    our Scalar multi method shift ( @array: ) is export

Remove the first element from C<@array> and return it.
=item splice

    our List multi method splice( @array is rw: Int $offset = 0, Int $size?, *@values ) is export

C fills many niches in array-management, but its fundamental behavior is to remove zero or more elements from an array and replace them with a new (and potentially empty) list. This operation can shorten or lengthen the target array.

C<$offset> is the index of the array element to start with. It defaults to C<0>.

C<$size> is the number of elements to remove from C<@array>. It defaults to removing the rest of the array from C<$offset> on.

The slurpy list of values (if any) is then inserted at C<$offset>.

When calling splice with a traditional parameter list, you must define C<$offset> and C<$size> if you wish to pass a replacement list of values. To avoid having to pass these otherwise optional parameters, use the piping operator(s):

    splice(@array,10) <== 1..*;

which replaces C<@array[10]> and all subsequent elements with an infinite series starting at C<1>.

This behaves similarly to Perl 5's C.

If C<@array> is multidimensional, C operates only on the first dimension, and works with Array References.

C returns the list of deleted elements in list context, and a reference to a list of deleted elements in scalar context.

=item unshift

    our Int multi method unshift ( @array: *@values ) is export

C adds the values onto the start of the C<@array>.

=item keys

=item kv

=item pairs

=item values

    our List multi method keys ( @array: Matcher *@indextests ) is export
    our List multi method kv ( @array: Matcher *@indextests ) is export
    our List multi method pairs (@array: Matcher *@indextests ) is export
    our List multi method values ( @array: Matcher *@indextests ) is export

Iterates the elements of C<@array>, in order.

If C<@indextests> are provided, only elements whose indices match C<$index ~~ any(@indextests)> are iterated.

What is returned at each element of the iteration varies with function.
C returns the value of the associated element; C returns a 2 element list in (index, value) order, C a C.

C<@array> is considered single dimensional. If it is in fact multi-dimensional, the values returned will be array references to the sub array.

In Scalar context, they all return the count of elements that would have been iterated.

=back

=head2 List

The following are defined in the C role/class:

=over

=item cat

    our Cat multi cat ( @values )

Returns a C object, a concatenated version of the list that does the C interface, but generates the string lazily to the extent permitted by the pattern of access to the string. Its two primary uses are matching against an array of strings and doing the equivalent of a C, except that C is always eager. However, a C in an interpolative context is also effectively eager, since the interpolator needs to know the string length. List context is lazy, though, so a C of a C is also lazy, and in fact, you just get a flat cat because C in a list context is a no-op. The C interface also lets you interrogate the object at a particular string position without actually stringifying the element; the regex engine can make use of this to match a tree node, for instance, without serializing the entire subtree.

Accessing a filehandle as both a filehandle and as a C is undefined, because lazy objects are not required to be as lazy as possible, but may instead choose to precalculate values in semi-eager batches to maximize cache hits.

=item classify

    our List of Pair multi method classify ( @values: Matcher $test )
    our List of Pair multi classify ( Matcher $test, *@values )

C takes a list or array of values and returns a lazily evaluated list comprised of pairs whose values are arrays of values from the input list, and whose keys are the return value of the C<$test>, when passed that value. For example:

    @list = (1, 2, 3, 4);
    (:@even, :@odd) := classify { $_ % 2 ?? 'odd' !!
                                  'even' } @list;

In this example, C<@even> will contain all even numbers from C<@list> and C<@odd> will contain all odd numbers from C<@list>.

To simply transform a list into a hash of arrays:

    %cars_by_color = classify { .color } @cars;
    red_car_owners(%cars_by_color.map:{.owner});

=item grep

    our List multi method grep ( @values: Matcher $test )
    our List multi grep ( Matcher $test, *@values )

C takes a list or array of values and returns a lazily evaluated list comprised of all of the values from the original list for which the C<$test> smart-matches as true.

Here is an example of its use:

    @friends = grep { .is_friend }, @coworkers;

This takes the array C<@coworkers>, checks every element to see which ones return true for the C<.is_friend> method, and returns the resulting list to store into C<@friends>.

Note that, unlike in Perl 5, a comma is required after the C in the multi form.

=item first

    our Item multi method first ( @values: Matcher $test )
    our Item multi first ( Matcher $test, *@values )

C works exactly like C but returns only the first matching value.

=item pick

    our List multi method pick ( @values: Int $num = 1, Bool :$repl )
    our List multi method pick ( @values: Whatever, Bool :$repl )
    our List multi pick ( Int $num, Bool :$repl, *@values )
    our List multi pick ( Whatever, Bool :$repl, *@values )

C takes a list or array of values and returns a random selection of elements from the list (without replacement unless C<:repl> is indicated). When selecting without replacement if C<*> is specified as the number (or if the number of elements in the list is less than the specified number), all the available elements are returned in random order:

    @team = @volunteers.pick(5);
    @shuffled = @deck.pick(*);

When selecting with replacement the specified number of picks are provided. In this case C<*> would provide an infinite list of random picks from C<@values>:

    @byte = (0,1).pick(8, :repl);
    for (1..20).pick(*, :repl) -> $die_roll { ...
    }

=item join

    our Str multi method join ( $separator: @values )
    our Str multi join ( Str $separator = ' ', *@values )

C returns a single string comprised of all of the elements of C<@values>, separated by C<$separator>.

Given an empty list, C returns the empty string.

The separator defaults to a single space. To join with no separator, you can use the C<[~]> reduce operator. The C function also effectively does a concatenation with no separator.

=item map

    our List of Capture multi method map ( @values: Code *&expression )
    our List of Capture multi map ( Code $expression, *@values )

C returns a lazily evaluated list which is comprised of the return value of the expression, evaluated once for every one of the C<@values> that are passed in.

Here is an example of its use:

    @addresses = map { %addresses_by_name{$_} }, @names;

Here we take an array of names, and look each name up in C<%addresses_by_name> in order to build the corresponding list of addresses.

If the expression returns no values or multiple values, then the resulting list may not be the same length as the number of values that were passed. For example:

    @factors = map { prime_factors($_) }, @composites;

The actual return value is a multislice containing one slice per map iteration. In most contexts these slices are flattened into a single list.
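The relationship between the per-iteration slices and the flattened result can be modeled as follows. This is a Python sketch of the semantics only, not Perl 6 code; C<map_slices> is a hypothetical name, and the C<prime_factors> helper is an assumed implementation written for this illustration (the example above does not define it).

```python
from itertools import chain


def map_slices(expression, values):
    """Lazily yield one slice (a tuple of results) per input value."""
    for value in values:
        yield tuple(expression(value))


def prime_factors(n):
    """Assumed helper: prime factorization by trial division."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)
    return factors


composites = [4, 6, 12]

# One slice per iteration, so the slice count equals the input count.
slices = list(map_slices(prime_factors, composites))

# Flattening the slices gives a list whose length can differ from the input.
factors = list(chain.from_iterable(slices))

# An expression may also return no values for some inputs.
evens = list(chain.from_iterable(
    map_slices(lambda n: [n] if n % 2 == 0 else [], [1, 2, 3, 4])))

print(slices)
print(factors)
print(evens)
```

Because C<map_slices> is a generator, nothing is computed until a slice is requested, which loosely mirrors the lazy evaluation described above.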
=item reduce

    our Item multi method reduce ( @values: Code *&expression )
    our Item multi reduce ( Code $expression ;; *@values ) {
        my $res;
        for @values -> $cur {
            FIRST {$res = $cur; next;}
            $res = &$expression($res, $cur);
        }
        $res;
    }

=item reverse

    role Hash {
        our Hash multi method reverse ( %hash: ) is export {
            (my %result){%hash.values} = %hash.keys;
            %result;
        }
    }

    our List multi method reverse ( @values: ) is export
    our List multi reverse ( *@values ) {
        gather {
            1 while take pop @values;
        }
    }

    role Str {
        our Str multi method reverse ( $str: ) is export {
            $str.split('').reverse.join;
        }
    }

=item sort

    our Array multi method sort( @values: *&by )
    our Array multi method sort( @values: Ordering @by )
    our Array multi method sort( @values: Ordering $by = &infix: )

    our List multi sort( Ordering @by, *@values )
    our List multi sort( Ordering $by = &infix:, *@values )

Returns C<@values> sorted, using criteria C<$by> or C<@by> for comparisons. C<@by> differs from C<$by> in that each criterion is applied, in order, until a non-zero (non-tie) result is achieved.

C is as described in L<"Type Declarations">. Any C may receive either or both of the mixins C and C to reverse the order of sort, or to adjust the case, sign, or other order sensitivity of C. (Mixins are applied to values using C.) If a C is used as an C then sort-specific traits such as C are allowed on the positional elements.

If all criteria are exhausted when comparing two elements, sort should return them in the same relative order they had in C<@values>.

To sort an array in place use the C<.=sort> mutator form.

See L for more details and examples (with C meaning C.)
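The C<@by> behavior (try each criterion in turn until one breaks the tie, and otherwise keep the input order) can be sketched in Python as below. This models the semantics only and is not Perl 6 code. C<sort_by> and the two comparator closures are hypothetical names, and the sketch relies on Python's C<sorted> being stable to stand in for the stability requirement stated above.

```python
from functools import cmp_to_key


def cmp(a, b):
    """Three-way comparison returning -1, 0, or +1."""
    return (a > b) - (a < b)


def sort_by(values, *criteria):
    """Sort using each comparator in order until one reports a non-tie."""
    def compare(a, b):
        for criterion in criteria:
            result = criterion(a, b)
            if result != 0:      # first non-tie decides the order
                return result
        return 0                 # all criteria exhausted: leave as a tie
    # sorted() is stable, so tied elements keep their original relative order.
    return sorted(values, key=cmp_to_key(compare))


people = [("Cy", 30), ("bob", 25), ("Ann", 30), ("dee", 25)]

by_age_descending = lambda a, b: cmp(b[1], a[1])
by_name_ignoring_case = lambda a, b: cmp(a[0].lower(), b[0].lower())

# One criterion: people of the same age stay in input order.
age_only = sort_by(people, by_age_descending)

# Two criteria: the second is consulted only when the first reports a tie.
age_then_name = sort_by(people, by_age_descending, by_name_ignoring_case)

print(age_only)
print(age_then_name)
```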
=item min

    our Array multi method min( @values: *&by )
    our Array multi method min( @values: Ordering @by )
    our Array multi method min( @values: Ordering $by = &infix: )

    our List multi min( Ordering @by, *@values )
    our List multi min( Ordering $by = &infix:, *@values )

Returns the earliest (i.e., lowest index) minimum element of C<@values>, using criteria C<$by> or C<@by> for comparisons. C<@by> differs from C<$by> in that each criterion is applied, in order, until a non-zero (non-tie) result is achieved.

C is as described in L<"Type Declarations">. Any C may receive the mixin C to adjust the case, sign, or other order sensitivity of C. (Mixins are applied to values using C.) If a C is used as an C then sort-specific traits such as C are allowed on the positional elements.

=item max

    our Array multi method max( @values: *&by )
    our Array multi method max( @values: Ordering @by )
    our Array multi method max( @values: Ordering $by = &infix: )

    our List multi max( Ordering @by, *@values )
    our List multi max( Ordering $by = &infix:, *@values )

Returns the earliest (i.e., lowest index) maximum element of C<@values>, using criteria C<$by> or C<@by> for comparisons. C<@by> differs from C<$by> in that each criterion is applied, in order, until a non-zero (non-tie) result is achieved.

C is as described in L<"Type Declarations">. Any C may receive the mixin C to adjust the case, sign, or other order sensitivity of C. (Mixins are applied to values using C.) If a C is used as an C then sort-specific traits such as C are allowed on the positional elements.

=back

=head2 Hash

The following are defined in the C role.

=over 4

=item :delete

    our List method :delete ( %hash: *@keys )
    our Scalar method :delete ( %hash: $key ) is default

Deletes the elements specified by C<$key> or C<@keys> from the invocant.
Returns the value(s) that were associated with those keys:

    @deleted = %foo.:delete{ @keys }

=item exists

    our Bool method :exists ( %hash: $key )

True if invocant has an element whose key matches C<$key>, false otherwise.

See also Code::exists to determine if a function has been declared. (Use defined() to determine whether the function body is defined. A body of ... counts as undefined.)

=item keys

=item kv

=item pairs

=item values

    multi Int|List keys ( %hash ; Matcher *@keytests )
    multi Int|List kv ( %hash ; Matcher *@keytests )
    multi Int|(List of Pair) pairs (%hash ; Matcher *@keytests )
    multi Int|List values ( %hash ; Matcher *@keytests )

Iterates the elements of C<%hash> in no apparent order, but the order will be the same between successive calls to these functions, as long as C<%hash> doesn't change.

If C<@keytests> are provided, only elements whose keys evaluate C<$key ~~ any(@keytests)> as true are iterated.

What is returned at each element of the iteration varies with function. C only returns the key; C the value; C returns both as a 2 element list in (key, value) order, C a C.

Note that C returns the same as C.

In Scalar context, they all return the count of elements that would have been iterated.

The lvalue form of C is no longer supported. Use the C<.buckets> property instead.

=back

=head2 Str

General notes about strings:

A Str can exist at several Unicode levels at once. Which level you interact with typically depends on what your current lexical context has declared the "working Unicode level" to be. Default is C.

[Default can't be C because we don't go into "language" mode unless there's a specific language declaration saying either exactly what language we're going into or, in the absence of that, how to find the exact language somewhere in the environment.]

Attempting to use a string at a level higher than it can support is handled without warning. The current highest supported level of the string is simply mapped Char for Char to the new higher level.
However, attempting to stuff something of a higher level into a lower-level string is an error (for example, attempting to store Kanji in a Byte string). An explicit conversion function must be used to tell it how you want it encoded.

Attempting to use a string at a level lower than what it supports is not allowed.

If a function takes a C and returns a C, the returned C will support the same levels as the input, unless specified otherwise.

The following are all provided by the C role:

=over

=item p5chop

    our Char multi method p5chop ( Str $string is rw: ) is export(:P5)
    my Char multi p5chop ( Str *@strings is rw ) is export(:P5)

Trims the last character from C<$string>, and returns it. Called with a list, it chops each item in turn, and returns the last character chopped.

=item chop

    our Str multi method chop ( Str $string: ) is export

Returns string with one Char removed from the end.

=item p5chomp

    our Int multi method p5chomp ( Str $string is rw: ) is export(:P5)
    my Int multi p5chomp ( Str *@strings is rw ) is export(:P5)

Related to C, only removes trailing chars that match C. In either case, it returns the number of chars removed.

=item chomp

    our Str multi method chomp ( Str $string: ) is export

Returns string with one newline removed from the end. An arbitrary terminator can be removed if the input filehandle has marked the string for where the "newline" begins. (Presumably this is stored as a property of the string.) Otherwise a standard newline is removed.

Note: Most users should just let their I/O handles autochomp instead. (Autochomping is the default.)

=item lc

    our Str multi method lc ( Str $string: ) is export

Returns the input string after converting each character to its lowercase form, if uppercase.

=item lcfirst

    our Str multi method lcfirst ( Str $string: ) is export

Like C, but only affects the first character.
=item uc

    our Str multi method uc ( Str $string: ) is export

Returns the input string after converting each character to its uppercase form, if lowercase. This is not a Unicode "titlecase" operation, but a full "uppercase".

=item ucfirst

    our Str multi method ucfirst ( Str $string: ) is export

Performs a Unicode "titlecase" operation on the first character of the string.

=item normalize

    our Str multi method normalize ( Str $string: Bool :$canonical = Bool::True, Bool :$recompose = Bool::False ) is export

Performs a Unicode "normalization" operation on the string. This involves decomposing the string into its most basic combining elements, and potentially re-composing it. Full detail on the process of decomposing and re-composing strings in a normalized form is covered in Sections 3.7 (Decomposition) and 3.11 (Canonical Ordering Behavior) of the Unicode Standard, 4.0. Additional named parameters are reserved for future Unicode expansion.

For everyday use there are aliases that map to the I document's names for the various modes of normalization:

    our Str multi method nfd ( Str $string: ) is export {
        $string.normalize(:canonical, :!recompose);
    }
    our Str multi method nfc ( Str $string: ) is export {
        $string.normalize(:canonical, :recompose);
    }
    our Str multi method nfkd ( Str $string: ) is export {
        $string.normalize(:!canonical, :!recompose);
    }
    our Str multi method nfkc ( Str $string: ) is export {
        $string.normalize(:!canonical, :recompose);
    }

Decomposing a string can be used to compare Unicode strings in a binary form, providing that they use the same encoding. Without decomposing first, two Unicode strings may contain the same text, but not the same byte-for-byte data, even in the same encoding. The decomposition of a string is performed according to tables in the Unicode standard, and should be compatible with decompositions performed by any system.

The C<:canonical> flag controls the use of "compatibility decompositions".
For example, in canonical mode, the ligature "ﬁ" (U+FB01) is left unaffected
because it is not a composition. However, in compatibility mode, it will be
replaced with the two letters "fi". Decomposed sequences will be ordered in a
canonical way in either mode.

The C<:recompose> flag controls the re-composition of decomposed forms.
That is, a combining sequence will be re-composed into the canonical
composite where possible.

These de-compositions and re-compositions are performed recursively,
until there is no further work to be done.

Note that this function is really only applicable when dealing with
codepoint strings. Grapheme strings are normally processed at a
higher abstraction level that is independent of normalization, and
are lazily normalized into the desired normalization when transferred
to lexical scopes or handles that care.

=item samecase

 our Str multi method samecase ( Str $string: Str $pattern ) is export

Has the effect of making the case of the string match the case pattern in
C<$pattern>. (Used by s:ii/// internally, see L<S05>.)

=item samebase

 our Str multi method samebase ( Str $string: Str $pattern ) is export

Has the effect of making the accents of the string match the accent pattern in
C<$pattern>. (Used by s:bb/// internally, see L<S05>.)

=item capitalize

 our Str multi method capitalize ( Str $string: ) is export

Has the effect of first doing an C<lc> on the entire string, then performing a
C<ucfirst> on each word in it.

=item length

This word is banned in Perl 6. You must specify units.

=item chars

 our Int multi method chars ( Str $string: ) is export

Returns the number of characters in the string according to the current
(lexically scoped) idea of what a normal character is, usually graphemes.

=item graphs

 our Int multi method graphs ( Str $string: ) is export

Returns the number of graphemes in the string in a language-independent way.

=item codes

 our Int multi method codes ( Str $string: $nf = $?NF) is export

Returns the number of codepoints in the string if it were canonicalized the
specified way.
Do not confuse codepoints with UTF-16 code units. Characters above U+FFFF
count as a single codepoint.

=item bytes

 our Int multi method bytes ( Str $string: $nf = $?NF, $enc = $?ENC) is export

Returns the number of bytes in the string if it were encoded in the
specified way. Note the inequality:

 .bytes("C","UTF-16") >= .codes("C") * 2

This is caused by the possibility of surrogate pairs, which occupy four
bytes but are counted as one codepoint. However, this problem does not
arise for UTF-32:

 .bytes("C","UTF-32") == .codes("C") * 4

=item index

 our StrPos multi method index( Str $string: Str $substring, StrPos $pos = StrPos(0) ) is export

C<index> searches for the first occurrence of C<$substring> in C<$string>,
starting at C<$pos>.

The value returned is always a C<StrPos> object. If the substring
is found, then the C<StrPos> represents the position of the first
character of the substring. If the substring is not found, a bare
C<StrPos> containing no position is returned. This prototype C<StrPos>
evaluates to false because it's really a kind of undef. Do not evaluate
it as a number, because instead of returning -1 it will return 0 and issue
a warning.

=item pack

 our Str multi pack( Str::Encoding $encoding, Pair *@items )
 our Str multi pack( Str::Encoding $encoding, Str $template, *@items )
 our buf8 multi pack( Pair *@items )
 our buf8 multi pack( Str $template, *@items )

C<pack> takes a list of pairs and formats the values according to
the specification of the keys. Alternately, it takes a string
C<$template> and formats the rest of its arguments according to
the specifications in the template string. The result is a sequence
of bytes.

An optional C<$encoding> can be used to specify the character
encoding to use in interpreting the result as a C<Str>, otherwise the return
value will simply be a C<buf8> containing the bytes generated by the
template(s) and value(s).
Note that no guarantee is made in terms of the final, internal representation of the string, only that the generated sequence of bytes will be interpreted as a string in the given encoding, and a string containing those graphemes will be returned. If the sequence of bytes represents an invalid string according to C<$encoding>, an exception is generated. Templates are strings of the form: grammar Str::PackTemplate { regex template { [ | ? ]* } token group { \(