An important concept in computer science is the free monoid on a set A, which essentially consists of sequences ⟨a_{1}…a_{n}⟩ of elements drawn from A. The key operations on the free monoid are:
 ⟨a⟩, forming a singleton sequence from a single element of A
 x⊕y, concatenation of the sequences x and y, which satisfies the associative law: (x⊕y)⊕z = x⊕(y⊕z)
 ⟨⟩, the empty sequence, which acts as an identity for concatenation: ⟨⟩⊕x = x⊕⟨⟩ = x
The free monoid satisfies the mathematical definition of a monoid, and is free in the sense of satisfying nothing else. There are many possible implementations of the free monoid, but they are all mathematically equivalent, which justifies calling it the free monoid.
In the R language, there are four main implementations of the free monoid: vectors, lists, dataframes (considered as sequences of rows), and strings (although for strings it’s difficult to tell where elements start and stop). The key operations are:

Vectors 
Lists 
Dataframes 
Strings 
⟨⟩, empty 
c() 
list() 
data.frame(n=c()) 
"" 
⟨a⟩, singleton 
implicit (single values are 1element vectors) 
list(a) 
data.frame(n=a) 
as.character(a) 
x⊕y, concatenation 
c(x,y) 
c(x,y) 
rbind(x,y) 
paste0(x,y) 
An arbitrary monoid on a set A is a set B equipped with:
 a function f from A to B
 a binary operation x⊗y, which again satisfies the associative law: (x⊗y)⊗z = x⊗(y⊗z)
 an element e which acts as an identity for the binary operator: e⊗x = x⊗e = x
As an example, we might have A = {2, 3, 5, …} be the prime numbers, B = {1, 2, 3, 4, 5, …} be the positive whole numbers, f(n) = n be the obvious injection function, ⊗ be multiplication, and (of course) e = 1. Then B is a monoid on A.
A homomorphism from the free monoid to B is a function h which respects the monoidonA structure. That is:
 h(⟨⟩) = e
 h(⟨a⟩) = f(a)
 h(x⊕y) = h(x) ⊗ h(y)
As a matter of fact, these restrictions uniquely define the homomorphism from the free monoid to B to be the function which maps the sequence ⟨a_{1}…a_{n}⟩ to f(a_{1}) ⊗ ⋯ ⊗ f(a_{n}).
In other words, simply specifying the monoid B with its function f from A to B and its binary operator ⊗ uniquely defines the homomorphism from the free monoid on A. Furthermore, this homomorphism logically splits into two parts:
 Map: apply the function f to every element of the input sequence ⟨a_{1}…a_{n}⟩
 Reduce: combine the results of mapping using the binary operator, to give f(a_{1}) ⊗ ⋯ ⊗ f(a_{n})
The combination of map and reduce is inherently parallel, since the binary operator ⊗ is associative. If our input sequence is spread out over a hundred computers, each can apply map and reduce to its own segment. The hundred results can then be sent to a central computer where the final 99 ⊗ operations are performed. Among other organisations, Google has made heavy use of this MapReduce paradigm, which goes back to Lisp and APL.
R also provides support for the basic map and reduce operations (albeit with some inconsistencies):

Vectors 
Lists 
Dataframes 
Strings 
Map with f 
sapply(v,f) , purrr::map_dbl(v,f) and related operators, or simply f(v) for vectorized functions 
lapply(x,f) or purrr::map(x,f) 
Vector operations on columns, possibly with dplyr::mutate , dplyr::transmute , purrr::pmap , or mapply 
Not possible, unless strsplit or tokenisation is used 
Reduce with ⊗ 
Reduce(g,v) , purrr::reduce(v,g) , or specific functions like sum , prod , and min 
purrr::reduce(x,g) 
Vector operations on columns, or specific functions like colSums , with purrr::reduce2(x,y,g) useful for twocolumn dataframes 
Not possible, unless strsplit or tokenisation is used 
It can be seen that it is particularly the conceptual reduce operator on dataframes that is poorly supported by the R language. Nevertheless, the map and reduce operations are both powerful mechanisms for manipulating data.
For nonassociative binary operators, purrr::reduce(x,g)
and similar functions remain extremely useful, but they become inherently sequential.
For more about purrr, see purrr.tidyverse.org.
Like this:
Like Loading...