Template:Short description Template:About In computer science, zipping is a function which maps a tuple of sequences into a sequence of tuples. This name zip derives from the action of a zipper in that it interleaves two formerly disjoint sequences. The inverse function is unzip.

ExampleEdit

Given the three words cat, fish and be where |cat| is 3, |fish| is 4 and |be| is 2. Let <math>\ell</math> denote the length of the longest word which is fish; <math>\ell = 4</math>. The zip of cat, fish, be is then 4 tuples of elements:

<math> (c,f,b)(a,i,e)(t,s,\#)(\#,h,\#)</math>

where # is a symbol not in the original alphabet. In Haskell this truncates to the shortest sequence <math>\underline{\ell}</math>, where <math>\underline{\ell} = 2</math>:

<syntaxhighlight lang="haskell"> zip3 "cat" "fish" "be" -- [('c','f','b'),('a','i','e')] </syntaxhighlight>

DefinitionEdit

Let Σ be an alphabet, # a symbol not in Σ.

Let x1x2... x|x|, y1y2... y|y|, z1z2... z|z|, ... be n words (i.e. finite sequences) of elements of Σ. Let <math>\ell</math> denote the length of the longest word, i.e. the maximum of |x|, |y|, |z|, ... .

The zip of these words is a finite sequence of n-tuples of elements of Template:Math, i.e. an element of <math>((\Sigma\cup\{\# \})^n)^*</math>:

<math> (x_1,y_1,\ldots)(x_2,y_2,\ldots)\ldots(x_\ell,y_\ell,\ldots)</math>,

where for any index Template:Math, the wi is #.

The zip of x, y, z, ... is denoted zip(x, y, z, ...) or xyz ⋆ ...

The inverse to zip is sometimes denoted unzip.

A variation of the zip operation is defined by:

<math> (x_1,y_1,\ldots)(x_2,y_2,\ldots)\ldots(x_{\underline{\ell}},y_{\underline{\ell}},\ldots)</math>

where <math>\underline{\ell}</math> is the minimum length of the input words. It avoids the use of an adjoined element <math>\#</math>, but destroys information about elements of the input sequences beyond <math>\underline{\ell}</math>.

In programming languagesEdit

Zip functions are often available in programming languages, often referred to as Template:Mono. In Lisp-dialects one can simply Template:Mono the desired function over the desired lists, Template:Mono is variadic in Lisp so it can take an arbitrary number of lists as argument. An example from Clojure:<ref>map from ClojureDocs</ref>

<syntaxhighlight lang="clojure">

`nums' contains an infinite list of numbers (0 1 2 3 ...)

(def nums (range)) (def tens [10 20 30]) (def firstname "Alice")

To zip (0 1 2 3 ...) and [10 20 30] into a vector, invoke `map vector' on them; same with list

(map vector nums tens)  ; ⇒ ([0 10] [1 20] [2 30]) (map list nums tens)  ; ⇒ ((0 10) (1 20) (2 30)) (map str nums tens)  ; ⇒ ("010" "120" "230")

`map' truncates to the shortest sequence; note missing \c and \e from "Alice"

(map vector nums tens firstname) ; ⇒ ([0 10 \A] [1 20 \l] [2 30 \i]) (map str nums tens firstname)  ; ⇒ ("010A" "120l" "230i")

To unzip, apply `map vector' or `map list'

(apply map list (map vector nums tens firstname))

⇒ ((0 1 2) (10 20 30) (\A \l \i))

</syntaxhighlight>

In Common Lisp: <syntaxhighlight lang="lisp"> (defparameter nums '(1 2 3)) (defparameter tens '(10 20 30)) (defparameter firstname "Alice")

(mapcar #'list nums tens)

⇒ ((1 10) (2 20) (3 30))

(mapcar #'list nums tens (coerce firstname 'list))

⇒ ((1 10 #\A) (2 20 #\l) (3 30 #\i)) — truncates on shortest list
Unzips

(apply #'mapcar #'list (mapcar #'list nums tens (coerce firstname 'list)))

⇒ ((1 2 3) (10 20 30) (#\A #\l #\i))

</syntaxhighlight>

Languages such as Python provide a Template:Mono function.<ref name="pydocs">map(function, iterable, ...) from section Built-in Functions from Python v2.7.2 documentation</ref> Template:Mono in conjunction with the Template:Mono operator unzips a list:<ref name="pydocs"/> <syntaxhighlight lang="pycon"> >>> nums = [1, 2, 3] >>> tens = [10, 20, 30] >>> firstname = 'Alice'

>>> zipped = list(zip(nums, tens)) >>> zipped [(1, 10), (2, 20), (3, 30)]

>>> list(zip(*zipped)) # unzip [(1, 2, 3), (10, 20, 30)]

>>> zipped2 = list(zip(nums, tens, list(firstname))) >>> zipped2 # zip, truncates on shortest [(1, 10, 'A'), (2, 20, 'l'), (3, 30, 'i')] >>> list(zip(*zipped2)) # unzip [(1, 2, 3), (10, 20, 30), ('A', 'l', 'i')] </syntaxhighlight>

Haskell has a method of zipping sequences but requires a specific function for each arity (Template:Mono for two sequences, Template:Mono for three etc.),<ref>zip :: [a] -> [b] -> [(a, b)] from Prelude, Basic libraries</ref> similarly the functions Template:Mono and Template:Mono are available for unzipping: <syntaxhighlight lang="haskell"> -- nums contains an infinite list of numbers [1, 2, 3, ...] nums = [1..] tens = [10, 20, 30] firstname = "Alice"

zip nums tens -- ⇒ [(1,10), (2,20), (3,30)] — zip, truncates infinite list unzip $ zip nums tens -- ⇒ ([1,2,3], [10,20,30]) — unzip

zip3 nums tens firstname -- ⇒ [(1,10,'A'), (2,20,'l'), (3,30,'i')] — zip, truncates unzip3 $ zip3 nums tens firstname -- ⇒ ([1,2,3], [10,20,30], "Ali") — unzip </syntaxhighlight>

Language comparisonEdit

List of languages by support of zip:

Zip in various languages
Language Zip Zip 3 lists Zip n lists Notes
Chapel Template:Mono Template:Mono Template:Mono citation CitationClass=web

}}</ref>

Clojure Template:Mono
Template:Mono
Template:Mono
Template:Mono
Template:Mono
Template:Mono
Stops after the length of the shortest list.
Common Lisp Template:Codett Template:Codett Template:Codett Stops after the length of the shortest list.
D Template:Mono
Template:Mono
Template:Mono
Template:Mono
Template:Mono
Template:Mono
citation CitationClass=web

}}</ref> The second form is an example of UFCS.

F# Template:Mono
Template:Mono
Template:Mono
Template:Mono
Template:Mono
Template:Mono
Haskell Template:Mono Template:Mono Template:Mono Template:Mono for n > 3 is available in the module Data.List. Stops after the shortest list ends.
Python Template:Mono Template:Mono Template:Mono Template:Mono and Template:Mono (3.x) stops after the shortest list ends, whereas Template:Mono (2.x) and Template:Mono (3.x) extends the shorter lists with Template:Mono items
Ruby Template:Mono Template:Mono Template:Mono citation CitationClass=web

}}</ref>

Scala Template:Mono citation CitationClass=web

}}</ref>

Unzip in various languages
Language Unzip Unzip 3 tuples Unzip n tuples Notes
Clojure Template:Mono Template:Mono Template:Mono
Common Lisp Template:Codett Template:Codett Template:Codett
F# Template:Mono
Template:Mono
Template:Mono
Template:Mono
Template:Mono
Template:Mono
Haskell Template:Mono Template:Mono Template:Mono Template:Mono for n > 3 is available in the module Template:Mono
Python Template:Mono Template:Mono Template:Mono

See alsoEdit

Template:Portal

ReferencesEdit

<references/>