=encoding utf8 =head1 TITLE DRAFT: Synopsis 17: Concurrency =head1 AUTHOR Elizabeth Mattijsen Audrey Tang =head1 VERSION Maintainer: Elizabeth Mattijsen Contributions: Christoph Buchetmann Date: 13 Jun 2005 Last Modified: 4 Mar 2008 Number 0 Version: 3 This is a draft document. After being some time under the surface of Perl 6 development this is a attempt to document working concurrency issues, list the remaining todos and mark the probably obsolete and redundant points. =head1 Overview Concurrency can take many forms in Perl 6. With varying degrees of explicitness and control capabilities. This document attempts to describe what these capabilities are and in which form they can be accessed in Perl 6. =head2 Processes, threads, fibers? Concurrency comes in many shapes and forms. Most Perl users are used to the concept of a "process" or a "thread" (usually depending on the OS they work on). Some systems even are familiar with very lightweight threads called "fibers". When discussing issues about concurrency with different people, it soon becomes apparent that everybody has his own set of "understandings" about what each word means, which doesn't make it any easier to describe Perl 6 concurrency. It seemed the most natural to use the word "thread" to describe a process which has its own context, but also shares context with 0 or more concurrently running processes. Depending on your OS, or even specific version of your OS, this could still be a single "process" from the OS's point of view. Or it could contain an OS process for each thread. Or any mixture of these two implementations. In this document we try to be agnostic about this: all we know in Perl 6 are "threads", which have their own context and share context with other concurrently running "threads". Whether they be process, threads or fibres at the OS level should not matter at the Perl 6 level. And for sake of consistency, an unthreaded "normal" program is considered to be also running in a single thread. =head2 Variables In the past, there have been two models for concurrent processes in Perl. In general, these are referred to as "5.005 threads" (C) and "ithreads" (C). The main difference between these two models from a programmer's point of view, is that variables in "5.005 threads" are shared by default. Whereas in the "ithreads" model, only variables that have been indicated to be "shared", are actually shared between threads. All other variable values are actually copies of the variable's value in the "parent" thread. With regards to variables, the concurrency model of Perl 6 is closer to the "5.005 threads" model than it is to the "ithreads" model. In fact, all variables "visible" to a particular scope in Perl 6 will be accessible and modifiable (if allowed to do so) from all of the concurrent processes that start from that scope. In that sense, one could consider the "ithreads" model as a historical diversion: the Perl 6 concurrency picks up where the "5.005 threads" path left off. (EM: maybe point out that the "ithreads" behaviour can be simulated with some kind of copy-on-write magic to be automagically added to all variable access inside a thread, except for those with an explicit "is shared" attribute?) =head1 Contend/Maybe/Defer =head2 No user accessible locks Differently from any current concurrent process implementation in Perl, there are no user accessible locks. Instead, the concept of Software Transactional Memory is used. This is in concept similar to the use of BEGIN TRANSACTION ... do your uninterruptible actions COMMIT in the database world. More interestingly, this also includes the concept of rollback: BEGIN TRANSACTION ... do your stuff, but impossible to complete: ROLLBACK This causes the state of the process to be reverted to the state at the moment the BEGIN TRANSACTION was executed. Perl 6 supports this concept through C blocks. These sections are guaranteed to either be completed totally (when the Code block is exited), or have their state reverted to the state at the start of the Code block (with the L statement). (EM: maybe point out if / how old style locks can be "simulated", for those needing a migration path?) =head2 Atomic Code blocks my ($x, $y); sub c { $x -= 3; $y += 3; $x < 10 or defer; } sub d { $x += 3; $y -= 3; $y < 10 or defer; } contend { # ... maybe { c() } maybe { d() }; # ... } A Code block can be prefixed with C. This means that code executed inside that scope is guaranteed not to be interrupted in any way. The start of a block marked C also becomes a I to which execution can return (in exactly the same state) if a problem occurs (a.k.a. a L is done) inside the scope of the Code block. =head3 defer The C function basically restores the state of the thread at the last checkpoint and will wait there until an external event allows it to potentially run that atomic C section of code again without having to defer again. If there are no external events possible that could restart execution, an exception will be raised. The last checkpoint is either the outermost C boundary, or the most immediate caller constructed with C. =head3 maybe The C statement causes a checkpoint to be made for C for each block in the C chain, creating an alternate execution path to be followed when a C is done. For example: maybe { ... some_condition() or defer; ... } maybe { ... some_other_condition() or defer; ... } maybe { ... } If placed outside a C block, the C statement creates its own C barrier. =head3 limitations Because Perl 6 must be able to revert its state to the state it had at the checkpoint, it is not allowed to perform any non-revertible actions. These would include reading / writing from file handles that do not support C (such as sockets). Attempting to do so will cause a fatal error to occur. This will probably need to be expanded to all objects: any object that has some interface with data "outside" of the knowledge of the language (e.g. an interface with an external XML library) would also need to provide some method for freezing a state, and restoring to a previously frozen state. If you're not interested in revertability, but are interested in uninterruptability, you could use the "is critical" trait. =head3 Critical Code blocks sub tricky is critical { # code accessing external info, not to be interrupted } if ($update) { is critical; # code accessing external info, not to be interrupted } A Code block marked "is critical" can not be interrupted in any way. But since it is able to access non-revertible data structures (such as non-seekable file handles), it cannot do a C as it would be impossible to restore the state to the beginning of the Code block. =head3 Mixing Atomic and Critical Both "atomic" as well as "critical" propagate down the call chain. This means that any subroutine that in itself is not "atomic" or "critical" becomes uninterruptible if called inside a code block that is marked as "atomic" or "critical". Atomic Code blocks called inside the call chain of a "critical" code block do not pose a problem, as they are more restrictive. Any code that attempts to perform any non-revertible action (e.g. reading from a socket) will cause a fatal error when called inside the call chain of an Atomic Code block. =head1 Co-Routines There is no real difference between a subroutine (or method) and a so-called 'co-routine'. The only difference is how control is relinquished by the subroutine. If this is done with a C statement, the subroutine is considered to be "normal" (i.e. the next time the subroutine is called, execution will start again at the top of the subroutine. A different situation occurs when the control is returned by a subroutine to its caller by means of a C statement. From the caller's point of view, there is no difference with the C statement. However, the next time the subroutine is called, execution will continue after the last C statement, B from the beginning of the subroutine. If there are no statements after the last C statement executed, then execution will start from the start of the called subroutine again. Please note that there is no concurrency involved with the C statement. Execution in the calling sub will halt until the called subroutine returns, regardless of whether this happens by a C or a C statement. Parameters passed to a subroutine that has previously returned with C will be handled as if if was an initial call to the subroutine. This means that the values of named parameters will be placed in their expected place inside the subroutine. Positional parameters will be available to any code looking at them, but will B be handled automatically. It would seem that the use of named parameters is therefore advisable. =head2 Coroutine attributes =over =item finished True on return() and false on midway yield() =back =head2 Coroutine methods =over =item start Set starting parameter list. =back =head2 Coroutine examples =over =item Coro as a function coro dbl { yield $_ * 2; yield $_; }; (1..4).map:{ dbl($_) } # should result in 2 2 6 4 # coro parameters coro perm ( @x is copy ) { while @x { @x.splice($_,1).yield; } } =item Constant coro coro foo { yield 42 }; # always return 42 =item Yield and return coro foo ($x) { yield $x; # this point with $x bound to 10 yield $x+1; return 5; ... # this is never reached, I think we all agree } =back =head1 Threads All outside of a thread defined variables are shared and transactional variables by default Program will wait for _all_ threads. Unjoined threads will be joined at the beginning of the END block batch of the parent thread that spawned them =head2 Thread creation A thread will be created using the keyword C followed by a codeblock being executed in this thread. my $thr = async { ...do something... END { } }; =head2 Thread status and attributes =over =item Self reflection TODO: how you can access thread attributes inside a thread async { say "my tid is ", +self; }; =item started start time =item finished end time =item waiting suspended (not diff from block on wakeup signal) waiting on a handle, a condition, a lock, et cetera otherwise returns false for running threads if it's finished then it's undef(?) =item current_continuation the CC currently running in that thread =item wake_on_readable, wake_on_writable, wake_on TODO: IO objects and containers gets concurrency love! $obj.wake_on_either_readable_or_writable_or_passed_time(3); # fixme fixme $obj.wake_on:{.readable} # busy wait, probably my @a is Array::Chan = 1..Inf; async { @a.push(1) }; async { @a.blocking_shift({ ... }) }; async { @a.unshift({ ... }) }; =back =head2 Thread operators =over =item Stringify Stringify to something sensible (eg. ""); my $thr = async { ... }; say ~$thr; =item Numerify Numify to TIDs (as in pugs) my $thr = async { ... }; say +$thr; =item Enumerable TODO: Enumerable with Conc.list =back =head2 Thread methods =over =item yield TODO: Conc.yield (if this is to live but deprecated, maybe call it sleep(0)?) =item sleep sleep() always respects other threads, thank you very much =item join wait for invocant to finish (always item cxt) my $thr = async { ... }; $thr.join(); =item die throw exception in the invocant thread =item alarm set up alarms =item alarms query existing alarms =item suspend pause a thread; fail if already paused =item resume revive a thread; fail if already running =item detach survives parent thread demise (promoted to process) process-local changes no longer affects parent tentatively, the control methods still applies to it including wait (which will always return undef) also needs to discard any atomicity context =item "is throttled" trait TODO: method throttled::trait_auxiliary: ($limit=1, :$key=gensym()) { # "is throttled" limits max connection to this Code object # the throttling is shared among closures with the same key # the limit may differ on closures with the same key. # if the counter with the "key" equals or exceeds a closure's limit, # the closure can't be entered until it's released # (this can be trivially implemented using contend+defer) } class Foo { method a is throttled(:limit(3) :key) { ... } method b is throttled(:limit(2) :key) { ... } } my Foo $f .= new; async { $f.a } async { $f.b } =back =head2 Signals Asynchronous exceptions are just like user-initiated exceptions with C, so you can also catch it with regular C blocks as specified in S04. To declare your main program catches INT signals, put a CATCH block anywhere in the toplevel to handle exceptions like this: CATCH { when Error::Signal::INT { ... } } =head2 Alarm An alarm is just a pre-arranged exception to be delivered to your program. By the time alarm has arrived, the current block may have already finished executing, so you would need to set up CATCH blocks in places where an alarm can rise to handle it properly. You can request an alarm using the number of seconds, or with a target date. It returns a proxy alarm object that you can do interesting things with. multi Alarm *alarm (Num $seconds = $CALLER::_, &do = {die Sig::ALARM}, :$repeat = 1) multi Alarm *alarm (Date $date, &do = {die Sig::ALARM}, :$repeat = 1) Perl 6's C has three additional features over traditional alarms: =head3 Multiple and Lexical Alarms One can set up multiple alarms using repeated alarm calls: { my $a1 = alarm(2); my $a2 = alarm(2); sleep 10; CATCH { is critical; # if you don't want $a2 to be raised inside this when Sig::ALARM { ... } } } To stop an alarm, call C<$alarm.stop>. The C method for Conc objects (including process and threads) returns a list of alarms currently scheduled for that concurrent context. When an alarm object is garbage collected, the alarm is stopped automatically. Under void context, the implicit alarm object can only be stopped by querying C<.alarms> on the current process. We are not sure what C would mean. Probably a deprecation warning? =head3 Repeated Alarms If you request a repeated alarm using the C named argument, it will attempt to fire off the alarm that many times. However, the alarm will be suppressed when inside a C block that's already handling the exception raised by I alarm. To repeat 0 times is to not fire off any alarms at all. To repeat +Inf times is to repeat over and over again. =head3 Callbacks in Alarms You can arrange a callback (like JavaScript's setTimeOut) in C, which will then be invoked with the then-current code as caller. If you set up such a callback to another Conc object, what happens is just like when you called C<.die> on behalf of that object -- namely, the callback closure, along with anything it referenced, is shared to the target Conc context. Unlike in Perl 5's ithreads where you cannot share anything after the fact, this allows passing shared objects in an C fashion across concurrent parts of the program. Under the default (multiplexing) concurrency model, this is basically a no-op. =head2 Continuations =head2 Junctive Autothreading and Hyper Operations Live in userland for the time being. =head2 Still more or less unorganized stuff ### INTERFACE BARRIER ### module Blah; { is atomic; # contend/maybe/whatever other rollback stuff # limitation: no external IO (without lethal warnings anyway) # can't do anything irreversible is critical; # free to do anything irreversible # means "don't interrupt me" # in system with critical section, no interrupts from # other threads will happen during execution # you can't suspend me my $boo is export; $boo = 1; # We decree that this part forms the static interface # it's run once during initial compilation under the # Separate Compilation doctrine and the syms sealed off # to form part of bytecode syms headers %CALLER::<&blah> = { 1 }; # work - adds to export set die "Eureka!" if %CALLER::<$sym>; # never dies # BEGIN { $boo = time }; sub IMPORT { # VERY DYNAMIC! our $i = time; %CALLER::<&blah> = { 1 }; # work - adds to export set die "Eureka!" if %CALLER::<$sym>; # probes interactively } } ### INTERFACE BARRIER ### =head2 See also =over =item * L =item * L =item * L =item * L =back =cut