PHP Pipe operator v3 Accepted


  • Version: 0.9

  • Date: 2025-02-05

  • Author: Larry Garfield ([email protected])

  • Status: Approved

Introduction

In object-oriented code, “composition” generally means “one object having a reference to another.” In functional programming, “composition” generally means “sticking two functions together end-to-end to make a new function.” Both are valid and useful techniques, especially in a multi-paradigm language like PHP.

Composition generally takes two forms: immediate and delayed. The immediate execution of chained functions is typically implemented with a “pipe” operator. Delayed execution is typically implemented with a composition operator, which takes two functions and produces a new function that will call each one in turn. The combination of the two cleanly enables “point-free style,” an approach to programming that limits the use of unnecessary intermediary variables. Point-free style has been gaining popularity in JavaScript circles, so it will be familiar to JavaScript developers using that style.

This RFC introduces the “pipe” operator, in the form used by most other languages with such functionality. A function composition operator is saved for a follow-up RFC. (See Future Scope.)

For example:

    function getUsers(): array {
        return [
            new User('root', isAdmin: true),
            new User('john.doe', isAdmin: false),
        ];
    }

    function isAdmin(User $user): bool {
        return $user->isAdmin;
    }

    // This is the new syntax.
    // The arrow function is parenthesized so its body ends before the next pipe.
    $numberOfAdmins = getUsers()
        |> (fn ($list) => array_filter($list, isAdmin(...)))
        |> count(...);

    var_dump($numberOfAdmins); // int(1)

Proposal

This RFC introduces a new operator:

mixed |> callable;

The |> operator, or “pipe,” accepts a single-parameter callable on the right and passes the left-side value to it, evaluating to the callable's result.

Pipe (|>) evaluates left to right by passing the value (or expression result) on the left as the first and only parameter to the callable on the right. That is, the following two code fragments are logically equivalent:

    $result = "Hello World" |> strlen(...);

    $result = strlen("Hello World");

For a single call that is not especially useful. It becomes useful when multiple calls are chained together. That is, the following two code fragments are effectively equivalent:

    $result = "Hello World"
        |> htmlentities(...)
        |> str_split(...)
        |> (fn($x) => array_map(strtoupper(...), $x))
        |> (fn($x) => array_filter($x, fn($v) => $v != 'O'));

    $temp = "Hello World";
    $temp = htmlentities($temp);
    $temp = str_split($temp);
    $temp = array_map(strtoupper(...), $temp);
    $temp = array_filter($temp, fn($v) => $v != 'O');
    $result = $temp;

The left-hand side of the pipe may be any value or expression. The right-hand side may be any valid PHP callable that takes a single parameter, or any expression that evaluates to such a callable. Functions with more than one required parameter are not allowed and will fail as if the function were called normally with insufficient arguments. If the right-hand side does not evaluate to a valid callable it will throw an Error.

A pipe chain is an expression, and therefore may be used anywhere an expression is valid.

Precedence

The pipe operator is left-associative. The left side will be evaluated first, before the right side.

The pipe operator precedence has been selected around expected common use cases. In particular, it binds before comparison operations so that its result may be compared, but after mathematical operations. If appropriate, parentheses can be added around any set of operations to alter or clarify the precedence, just like any other expression.

    // These are equivalent.
    $res1 = 5 + 2 |> someFunc(...);
    $res1 = (5 + 2) |> someFunc(...);

    // The result of the pipe chain is compared against 4.
    $res1 = 'beep' |> strlen(...) == 4;

    // The pipe executes before the ??, so the
    // default value applies to the result of the whole chain.
    $user = $id |> get_username(...) ?? 'default';

    // This requires parens
    $res1 = 5 |> ($user_specified_func ?? defaultFunc(...));

    // This requires parens to allow the
    // ternary to run first and select the callable to use.
    $res1 = 5 |> ($config['flag'] ? enabledFunc(...) : disabledFunc(...));

Performance

The current implementation works entirely at the compiler level, and effectively transforms the first example above into the second at compile time. The result is that pipe itself has virtually no runtime overhead. (Any additional closures created by the user while writing a pipe will of course have their own overhead.)

More precisely, the implementation contains 3 optimizations, for a function-style, method-style, or static method style first-class-callable. Those will be compiled down to direct calls, and therefore have no performance overhead. An arbitrary expression will be evaluated and then executed, with no additional overhead. However, that arbitrary expression may have additional logic in it to support a pipe that cannot be compiled away. For example, an inline arrow function that simply forwards the call is not detectable, so the extra arrow function cannot be optimized away.

That means the above example would more precisely compile to:

    $temp = "Hello World";
    $temp = htmlentities($temp);
    $temp = str_split($temp);
    $temp = (fn($x) => array_map(strtoupper(...), $x))($temp);
    $temp = (fn($x) => array_filter($x, fn($v) => $v != 'O'))($temp);
    $result = $temp;

That gives a little overhead over making all calls directly, but only in some cases, and not dramatically so.

Should the proposed follow-up Partial Function Application RFC (see below) be approved, it would be logical and recommended for a similar optimization to be made when a PFA appears on the right side of a pipe. That would eliminate most common uses of a wrapping closure as well.

Callable styles

Pipe supports any callable syntax supported by PHP. At present, the most common form is first-class callables (e.g., strlen(...)), which dovetail with this syntax very cleanly. Should further improvements be made in the future, such as a revised Partial Function Application RFC, they would be supported naturally.
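As a rough illustration of the callable styles mentioned above, the following sketch pipes one value through a closure variable, a function first-class callable, a method first-class callable, a static method first-class callable, and an invokable object. It assumes PHP 8.5 (the version this RFC targets); the Greeter and Exclaim classes and all variable names are hypothetical and exist only for this example.

```php
<?php
declare(strict_types=1);

// Requires PHP 8.5 for the |> operator.
// Greeter and Exclaim are hypothetical classes used only for illustration.
final class Greeter
{
    public function greet(string $name): string
    {
        return "Hello, $name";
    }

    public static function shout(string $s): string
    {
        return strtoupper($s) . '!';
    }
}

final class Exclaim
{
    public function __invoke(string $s): string
    {
        return $s . '?';
    }
}

$greeter = new Greeter();
$trimmed = static fn(string $s): string => trim($s);
$invokable = new Exclaim();

$result = '  world  '
    |> $trimmed              // closure stored in a variable
    |> ucfirst(...)          // function first-class callable
    |> $greeter->greet(...)  // method first-class callable
    |> Greeter::shout(...)   // static method first-class callable
    |> $invokable;           // object with __invoke()

echo $result, PHP_EOL; // HELLO, WORLD!?
```

Each stage receives exactly one argument, which is why every callable here declares a single required parameter.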

References

As usual, references are an issue. Supporting pass-by-ref parameters in simple cases is quite easy, and a naive implementation would support it. However, passing a value from a compound value (an object property or array element) by reference does not work, and throws an “Argument could not be passed by reference” error. In practice, it is easier to forbid pass-by-ref parameters in pipe than to allow them.

    $arr = ['a' => 'A', 'b' => 'B'];

    $val = 'C';

    function inc_print(&$v) {
        $v++;
        print $v;
    }

    // This can be made to work.
    $val |> inc_print(...);

    // This cannot be easily made to work, and it might not even be possible.
    $arr['a'] |> inc_print(...);

That is also consistent with the typical usage patterns. The whole design of the pipe operator is that data flows through it from left to right, in a pure-functional way. Passing by reference would introduce all sorts of potential “spooky action at a distance.” In practice, there are few if any use cases where it would be appropriate to do in the first place.

For that reason, pass-by-ref callables are disallowed on the right-hand side of a pipe operator. That is, both examples above would error.

One exception to this is “prefer-ref” functions, which only exist in the stdlib and cannot be implemented in user-space. There are a small handful of functions that will accept either a reference or a direct value, and vary their behavior depending on which they get. When those functions are used with the pipe operator, the value will be passed by value, and the function will behave accordingly.

Syntax choice

F#, Elixir, and OCaml all use the |> operator already for this exact same behavior. There has been a long-standing discussion in JavaScript about adding a |> operator as described here. It is the standard operator for this task.

Use cases

The use cases for a pipe operator are varied. They include, among others, encouraging shallow-function-nesting, encouraging pure functions, expressing a complex process in a single expression, and emulating extension functions.

The following examples are all simplified from real-world use cases in code I have written.

String manipulation

    // Simplified version. The original code this is
    // based on splits a string at _ and capitals, too, but
    // that is omitted to focus on the core point.
    function splitString(string $input): array {
        return explode(' ', $input);
    }

    // Convert a string to snake_case

    $result = 'Fred Flintstone'
        |> splitString(...)                // Produces an array of individual words.
        |> (fn($x) => implode('_', $x))    // Join those words with _
        |> strtolower(...)                 // Lowercase everything.
    ;

    // $result is 'fred_flintstone'

    // Convert a string to lowerCamelCase

    $result = 'Fred Flintstone'
        |> splitString(...)
        |> (fn($x) => array_map(ucfirst(...), $x))  // Uppercase the first letter of each word
        |> (fn($x) => implode('', $x))              // Join those words
        |> lcfirst(...)                             // Now lowercase just the first letter
    ;

    // $result is 'fredFlintstone'

Array combination

    $arr = [
        new Widget(tags: ['a', 'b', 'c']),
        new Widget(tags: ['c', 'd', 'e']),
        new Widget(tags: ['x', 'y', 'a']),
    ];

    $result = $arr
        |> (fn($x) => array_column($x, 'tags'))  // Gets an array of arrays
        |> (fn($x) => array_merge(...$x))        // Flatten that array into one big array
        |> array_unique(...)                     // Remove duplicates
        |> array_values(...)                     // Reindex the array.
    ;

    // $result is ['a', 'b', 'c', 'd', 'e', 'x', 'y']

The single-expression alternative today would be:

array_values(array_unique(array_merge(...array_column($arr, 'tags'))));

Which I believe is indisputably worse.

Shallow calls

The use of a pipe for function composition also helps to separate closely related tasks so they can be developed and tested in isolation. For a (slightly) contrived and simple example, consider:

    function loadWidget($id): Widget {
        $record = DB::query("something");
        return makeWidget($record);
    }

    function loadMany(array $ids): array {
        $data = DB::query("something");
        $ret = [];
        foreach ($data as $record) {
            $ret[] = makeWidget($record);
        }
        return $ret;
    }

    function makeWidget(array $record): Widget {
        // Assume this is more complicated.
        return new Widget(...$record);
    }

In this code, it is impossible to test loadWidget() or loadMany() without also executing makeWidget(). While in this trivial example that's not a huge problem, in a more complex example it often is, especially if several functions/methods are nested more deeply. Dependency injection cannot fully solve this problem, unless each step is in a separate injected class.

By making it easy to chain functions together, however, that can be rebuilt like this:

    function loadWidget($id): array {
        return DB::query("something");
    }

    function loadMany(array $ids): array {
        return DB::query("something else");
    }

    function makeWidget(array $record): Widget {
        // Assume this is more complicated.
        return new Widget(...$record);
    }

    $widget = loadWidget(5) |> makeWidget(...);

    $widgets = [1, 4, 5]
        |> loadMany(...)
        |> (fn(array $records) => array_map(makeWidget(...), $records));

And the latter could be further simplified with either a higher-order function or partial function application. Those chains could also be wrapped up into their own functions/methods for trivial reuse. They can also be extended, too. For instance, the result of loadMany() is most likely going to be used in a foreach() loop. That's a simple further step in the chain.

    $profit = [1, 4, 5]
        |> loadMany(...)
        |> (fn(array $records) => array_map(makeWidget(...), $records))
        |> (fn(array $ws) => array_filter($ws, isOnSale(...)))
        |> (fn(array $ws) => array_map(sellWidget(...), $ws))
        |> array_sum(...);

Moreover, because a pipe can take any callable, a pipe chain can be easily packaged up, either as a named function or anon function.

    // This would be the "real" API that most code uses.
    function loadSeveral($id) {
        return $id
            |> loadMany(...)
            |> (fn(array $records) => array_map(makeWidget(...), $records));
    }

    $profit = [1, 4, 5]
        |> loadSeveral(...)
        |> (fn(array $ws) => array_filter($ws, isOnSale(...)))
        |> (fn(array $ws) => array_map(sellWidget(...), $ws))
        |> array_sum(...);

That neatly encapsulates the entire logic flow of a process in a clear, compact, highly-testable set of operations.

Pseudo-extension functions

“Extension functions” are a feature of Kotlin and C# (and possibly other languages) that allow for a function to act as though it is a method of another object. It has only public-read access, but has the ergonomics of a method. While not a perfect substitute, pipes do offer similar capability with a little more work.

For instance, we could easily make utility higher-order functions (functions that take or return other functions/callables) that will map or filter an array that is piped to them. (A more robust version that also handles iterables is only slightly more work.)

    function amap(callable $c): \Closure {
        return fn(array $a) => array_map($c, $a);
    }

    function afilter(callable $c): \Closure {
        return fn(array $a) => array_filter($a, $c);
    }

That allows them to be used, via pipes, in a manner similar to “scalar methods.” To reuse the earlier example:

    $profit = [1, 4, 5]
        |> loadSeveral(...)
        |> afilter(isOnSale(...))
        |> amap(sellWidget(...))
        |> array_sum(...);

Which is not far off from what it would look like with scalar methods, which still wouldn't work if any step along the way contained an object:

    $profit = [1, 4, 5]
        ->loadSeveral(...)
        ->afilter(isOnSale(...))
        ->amap(sellWidget(...))
        ->sum(...);

The pipe version, however, works with any value type, object or scalar. It also entirely removes the “does the subject come first or last” question: the subject is piped, and the arguments to the higher-order function are the modifiers. It also eliminates the need to discuss which methods deserve to be “first class” operations that turn into methods. Any function can be chained onto a value of any type.

While I do not believe pipes can completely replace extension functions, they provide a reasonable emulation and most of the benefits, for trivial cost.

This RFC does not propose any such higher-order functions for the PHP standard library, as most are quite easy to implement in user space. However, such could be easily added in the future if desired for especially common cases.
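The earlier remark that a version handling iterables is only slightly more work can be sketched roughly as follows. The names imap() and ifilter() are hypothetical, chosen to mirror amap() and afilter(); this is one possible user-space shape using generators, not a proposed API. It runs on current PHP versions because the stages are invoked with plain calls rather than pipe syntax.

```php
<?php
declare(strict_types=1);

// Hypothetical lazy counterparts of amap()/afilter() that accept any iterable.
function imap(callable $c): \Closure
{
    return static function (iterable $it) use ($c): \Generator {
        foreach ($it as $key => $value) {
            yield $key => $c($value);
        }
    };
}

function ifilter(callable $c): \Closure
{
    return static function (iterable $it) use ($c): \Generator {
        foreach ($it as $key => $value) {
            // Keep only the values the predicate accepts, preserving keys.
            if ($c($value)) {
                yield $key => $value;
            }
        }
    };
}

$isEven = static fn(int $n): bool => $n % 2 === 0;
$square = static fn(int $n): int => $n * $n;

// Each helper returns a closure, which is then called with the subject.
$evens = ifilter($isEven)([1, 2, 3, 4]);
$squares = imap($square)($evens);

echo implode(', ', iterator_to_array($squares, false)), PHP_EOL; // 4, 16
```

With the pipe operator, the same stages would read `[1, 2, 3, 4] |> ifilter($isEven) |> imap($square)`. Because the stages are generators, nothing is computed until the result is iterated.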

Single-expression pipelines

Of particular note, all of the above examples are a single expression. That makes them trivial to use in places where only a single-expression is allowed, such as match() arms, short-get property hooks, short-closures, etc. For example:

    $string = 'something GoesHERE';

    $newString = match ($format) {
        'snake_case' => $string
            |> splitString(...)
            |> (fn($x) => implode('_', $x))
            |> strtolower(...),
        'lowerCamel' => $string
            |> splitString(...)
            |> (fn($x) => array_map(ucfirst(...), $x))
            |> (fn($x) => implode('', $x))
            |> lcfirst(...),
        // Other case options here.
    };

    class BunchOfTags
    {
        private array $widgets = [];

        public array $tags {
            get => $this->widgets
                |> (fn($x) => array_column($x, 'tags'))
                |> (fn($x) => array_merge(...$x))
                |> array_unique(...)
                |> array_values(...);
        }
    }

    $loadSeveral = fn($id) => $id
        |> loadMany(...)
        |> (fn(array $records) => array_map(makeWidget(...), $records));

Streams

Similarly, a few small utility functions (either in C or PHP) would allow pipes to be used with stream resources. As a simplified example:

    function decode_rot13($fp): \Generator {
        // Compare against false explicitly so a "0" character does not end the loop.
        while (($c = fgetc($fp)) !== false) {
            yield str_rot13($c);
        }
    }

    // Takes an iterable of strings and returns a line-buffering version of it.
    // A more robust version is of course possible, but longer.
    function lines_from_charstream(iterable $it): \Generator {
        $buffer = '';
        foreach ($it as $c) {
            $buffer .= $c;
            while (($pos = strpos($buffer, PHP_EOL)) !== false) {
                yield substr($buffer, 0, $pos);
                // Skip past the line ending so the same line is not matched again.
                $buffer = substr($buffer, $pos + strlen(PHP_EOL));
            }
        }
    }

    // map() is assumed to be an iterable-aware helper along the lines of amap() above.
    fopen('pipes.md', 'rb') // No variable, so it will close automatically when GCed.
        |> decode_rot13(...)
        |> lines_from_charstream(...)
        |> map(str_getcsv(...))
        |> map(Product::create(...))
        |> map($repo->save(...))
    ;

This is just a demonstration of the potential, not a recommendation for a specific API. But hopefully it shows the potential for making working with streams in a structured way vastly easier, in a way found in a number of languages.

Existing implementations

Multiple user-space libraries exist in PHP that attempt to replicate pipe-like or compose-like behavior. All are clunky and complex by necessity compared to a native solution. There is clear demand for this functionality, but user-space's ability to provide it is currently limited. This list has only grown since the Pipes v2 RFC, indicating an even stronger benefit to the PHP ecosystem with a solid built-in composition syntax.

  • The PHP League has a Pipeline library that encourages wrapping all functions into classes with an __invoke() method to allow them to be referenced, and using a ->pipe() call for each step.

  • Sebastiaan Luca has a pipe library that works through abuse of the __call method. It only works for named functions, I believe, not for arbitrary callables.

  • PipePie is another very similar implementation to the previous ones.

  • ZenPipe is a newcomer that also uses a method named pipe() for what is actually a composition operation.

  • Crell/fp provides pipe() and compose() functions that take an array of callables. While the lightest-weight option on this list, that makes dynamically-built pipelines or compositions more cumbersome than the syntax proposed here.

Those libraries would be mostly obsoleted by this RFC (in combination with the compose follow-on, as noted in Future Scope), with a more compact, more universal, better-performing syntax.

Why in the engine?

The biggest limitation of any user-space implementation is performance. Even the most minimal implementation (Crell/fp) requires adding 2-3 function calls to every operation, which is relatively expensive in PHP. A native implementation would not have that additional overhead. Crell/fp also results in somewhat awkward function nesting, like this:

    pipe($someVal,
        htmlentities(...),
        str_split(...),
        fn($x) => array_map(strtoupper(...), $x),
        fn($x) => array_filter($x, fn($v) => $v != 'O'),
    );

    // (Or worse if you need conditional stages.)

Compared to the compiled version, this has two extra wrapping closures (for htmlentities() and str_split()) that could not be compiled away, the call to pipe() itself, and a foreach loop inside pipe. All of those are eliminated with a native operator. As noted above, PFA would also allow eliminating the two manual closures via compiler optimization, something a user-space implementation would never be able to do.
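To make the overhead described above concrete, here is a minimal sketch of what a user-space pipe helper typically looks like. The function pipe_sketch() is hypothetical and is not the actual Crell/fp source; it only illustrates the extra function call and the loop that a native operator avoids.

```php
<?php
declare(strict_types=1);

// Hypothetical user-space helper, written for illustration only.
function pipe_sketch(mixed $value, callable ...$stages): mixed
{
    foreach ($stages as $stage) {
        // One dynamic call per stage, on top of the call to pipe_sketch() itself.
        $value = $stage($value);
    }
    return $value;
}

$result = pipe_sketch(
    'Hello World',
    htmlentities(...),
    str_split(...),
    fn(array $x) => array_map(strtoupper(...), $x),
    fn(array $x) => array_filter($x, fn(string $v) => $v !== 'O'),
);

echo implode('', $result), PHP_EOL; // HELL WRLD
```

Every stage, including the plain htmlentities() and str_split() steps, has to be materialized as a callable and invoked dynamically inside the loop, which is the cost the compiled operator removes.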

More elaborate implementations tend to involve magic methods (which are substantially slower than normal function/method calls) or multi-layer middlewares, which are severe overkill for sticking two functions together.

Additionally, a native operator would make it much easier for static analysis tools to ensure compatible types. The SA tools would know the input value's type, in most cases the callable type on the RHS, and could compare them directly without several layers of obfuscated user-space function calls between them.

Future Scope

This RFC is deliberately “step 1” of several closely related features to make composition-based code easier and more ergonomic. It offers benefit on its own, but deliberately dovetails with several other features that are worthy of their own RFCs.

Language features

A compose operator for closures (likely +). Where pipe executes immediately, compose creates a new callable (Closure) that composes two or more other Closures. That allows a new operation to be defined simply and easily and then saved for later in a variable. Because it is “just” an operator, it is compatible with all other language features. That means, for example, conditionally building up a pipeline is just a matter of throwing if statements around as appropriate. The author firmly believes that a compose operator is a necessary companion to pipe, and the functionality will be incomplete without it. However, while pipe can be implemented trivially in the compile step, a compose operator will require non-trivial runtime work. For that reason it has been split out to its own RFC.

General partial function application. While the prior RFC was declined due to its perceived use cases being insufficient to justify its complexity, there was clear interest in it, and it would vastly improve the usability of function composition. If a less complex implementation can be found, it would most likely pass and complement this RFC well.

A “close but accept object” symbol. Currently, it's not convenient to call a method on an object returned in a pipe step without wrapping it into a closure, like so:

    $obj |> (fn($x) => $x->foo(4, 5));

One possible alternative would be a syntax such as $$->foo(4, 5), which would be logically equivalent to the above. That would also work for properties, allowing easy reading or writing of “the object in question.” It's likely we could optimize that syntax away the same way FCCs are in this RFC.

A __bind method or similar on objects, possibly with a dedicated operator of its own (such as >>=). If implemented by an object on the left-hand side, the right-hand side would be passed to that method to invoke as it sees fit. Such a feature would be sufficient to support arbitrary monadic behavior in PHP in a type-friendly way.
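The delayed composition described in the compose-operator item above can be approximated in user space today, which may help clarify what such an operator would produce. The sketch below uses a hypothetical compose_sketch() function, not the proposed operator, and shows a pipeline being built up conditionally before it is ever run. The $slugify flag is likewise an assumption made for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns a new Closure that runs each function in turn.
function compose_sketch(callable ...$fns): \Closure
{
    return static function (mixed $value) use ($fns): mixed {
        foreach ($fns as $fn) {
            $value = $fn($value);
        }
        return $value;
    };
}

// Build the list of stages conditionally, then compose once.
$stages = [trim(...), strtolower(...)];

$slugify = true; // hypothetical configuration flag
if ($slugify) {
    $stages[] = static fn(string $s): string => str_replace(' ', '-', $s);
}

$normalize = compose_sketch(...$stages);

// Nothing runs until the composed closure is called.
echo $normalize('  Fred Flintstone '), PHP_EOL; // fred-flintstone
```

The foreach loop inside the returned closure is the kind of runtime work the RFC refers to when it says compose cannot be handled purely in the compile step.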

Iterable API

There has been on-again-off-again discussion of a new iterable API for some time, one that could replace array_map(), etc. with a more ergonomic, iterator-friendly API. Pipes are a natural fit for that.

Consider the case of having reimplemented iter\map(), iter\filter(), iter\first(), iter\unique(), and so on, in C. Such functions could be written to accept both arrays and iterables, and to return the same type they were given. Combined with Partial Application, that would lead to:

    $result = $pdo->query("Some complex SQL")
        |> filter(?, someFilter(...))
        |> map(?, transformer(...))
        |> unique(...)
        |> first(someCriteria(...));

There are likely other optimizations that a C implementation could make. All functions would be both pipe-friendly and usable stand-alone.

An iterable API is not included in this RFC, but this RFC would make implementing one in a fully flexible way substantially easier.

Rejected Features

There was discussion both on-list and off of “auto-partialling” functions after a pipe, such that the operand value is always passed as the first argument of the right side function. That would allow functions to be written in a “pipe friendly” way by assuming subject-first, and then would not need Partial Application or a wrapping closure in order to provide additional arguments. This is the way Elixir works, for instance. For example:

    // This RFC today
    $foo |> bar(...) |> (fn($x) => array_filter($x, fn($v) => $v != 'O'));

    // If PFA passes:
    $foo |> bar(...) |> array_filter(?, fn($v) => $v != 'O');

    // With Elixir style
    $foo |> bar() |> array_filter(fn($v) => $v != 'O');

(The use of expressions on the RHS would be supported by wrapping them in parens, to indicate “don't auto-partial.”)

While the author found that compelling, several commenters felt it would be too surprising and unexpected for developers. For that reason, it has been left out of this RFC. It would be possible to add support in the future, but only if the auto-partialling became “opt in” with some additional syntax rather than “opt out.” (Like $foo |> @array_filter(fn($v) => $v != 'O') or something like that.)

Backward Incompatible Changes

None

Proposed PHP Version(s)

8.5

Open Issues

None

Proposed Voting Choices

Yes or no vote. 2/3 required to pass.

Patches and Tests

Implementation

After the project is implemented, this section should contain

  1. the version(s) it was merged into

  2. a link to the git commit(s)

  3. a link to the PHP manual entry for the feature

  4. a link to the language specification section (if any)

References

Links to external references, discussions or RFCs

rfc/pipe-operator-v3.txt · Last modified: 2025/05/28 04:17 by crell

