Merikanto

一簫一劍平生意,負盡狂名十五年

QuickSort with Python - 4 Ways of Implementation

QuickSort is a Divide and Conquer algorithm. It picks an element as pivot and partitions the given array around the picked pivot. There are many versions of QuickSort that pick pivot(s) in different ways. Here I’m going to show 4 ways of implementing QuickSort in Python.


Version 1

Version 1 is the most common way to implement QuickSort. Here we pick the first element as the pivot.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
def qs1(t, lo, hi):
if lo >= hi:
return t
p = t[lo] # Pivot is the 1st element

i = lo
j = hi

while i != j:
# 此间顺序不可颠倒
while t[j] >= p and i < j:
j = j - 1
while t[i] <= p and i < j:
i = i + 1
t[i], t[j] = t[j], t[i]
t[lo], t[i] = t[i], t[lo]

qs1(t, lo, i-1)
qs1(t, i+1, hi)

return t

Version 2

Version 2 is from MIT’s Introduction to Algorithms (CLRS).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
def qs2(array, l, r):
if l < r:
q = partition(array, l, r)
qs2(array, l, q - 1)
qs2(array, q + 1, r)

def partition(array, l, r):
x = array[r]
i = l - 1
for j in range(l, r):
if array[j] <= x:
i += 1
array[i], array[j] = array[j], array[i]
array[i + 1], array[r] = array[r], array[i+1]
return i + 1

Version 3

Version 3 is a simplified way of implementing QuickSort. The pivot is the last element of the list.

1
2
3
4
5
6
7
8
9
10
11
12
13
def qs3(t):
if len(t) <= 1:
return t
l = []
r = []
p = t.pop()

for c in t:
if c < p:
l.append(c)
else:
r.append(c)
return qs3(l) + [p] + qs3(r)

Version 4

Version 4 is the improved quicksort (from Prof. Siegel’s textbook). Note that less comparisons are made.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
def qs4(t, lo, hi):
if lo >= hi:
return t
if t[lo] < t[hi]:
t[lo], t[hi] = t[hi], t[lo]
p = t[hi]
i = lo
j = hi
while i < j:
t[i], t[j] = t[j], t[i]
i = i + 1
while t[i] < p:
i = i + 1
j = j - 1
while t[j] > p:
j = j - 1
t[lo], t[j] = t[j], t[lo]
qs4(t, lo, j-1)
qs4(t, j+1, hi)
return t

Comparison

Among all the three, Version 3 is the fastest when the data isn’t very large (less than 1 million). It also comes with a cost of space, since multiple lists are created. However, when the data is larger than 1 million, Version 4 becomes the fastest. To time the performance, we could use Python’s built-in module time. For instance, to time Version 3’s performance:

1
2
3
4
5
6
7
import time

def timeit():
start = time.time()
qs3(t)
end = time.time()
print ('Version 3: ', end - start)

To generate random numbers for testing, we use Python’s random module. For example,

1
2
3
4
import random

# 1 Million Numbers
t = [random.randint(1, 2000000000) for i in range(1000000)]