Lecture 7

    Debugging
  1. Use journalctl on Linux or log show on macOS to get the super user accesses and commands in the last day. If there aren’t any you can execute some harmless commands such as sudo ls and check again.
    sudo journalctl _COMM=sudo --since yesterday
  2. Do this hands on pdb tutorial to familiarize yourself with the commands. For a more in depth tutorial read this.
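    As a quick warm-up (a minimal sketch, separate from the linked tutorial; the file and function names here are made up), you can drop into the debugger with breakpoint() and then try the usual pdb commands:

    pdb_example.py

    # pdb_example.py -- hypothetical toy script, just something to step through
    def average(numbers):
        total = 0
        for n in numbers:
            total += n
        breakpoint()  # pauses here and opens a pdb prompt (Python 3.7+)
        return total / len(numbers)

    if __name__ == "__main__":
        print(average([1, 2, 3, 4]))
    At the (Pdb) prompt, l lists the surrounding source, n and s step over/into the next line, p total prints a variable, w shows the stack, c continues, and q quits. You can also run the whole script under the debugger with python -m pdb pdb_example.py.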
  3. Install shellcheck and try checking the following script. What is wrong with the code? Fix it. Install a linter plugin in your editor so you can get your warnings automatically.

    #!/bin/sh
    ## Example: a typical script with several problems
    for f in $(ls *.m3u)
    do
      grep -qi hq.*mp3 $f \
        && echo -e 'Playlist $f contains a HQ file in mp3 format'
    done
    After installing Shellcheck (sudo apt install shellcheck), run this command.
    shellcheck script.sh
    ShellCheck flags several issues: the loop iterates over ls output instead of a glob, the grep pattern and $f are unquoted (so they are subject to globbing and word splitting), the single-quoted echo string prevents $f from expanding, and echo -e is not portable in POSIX sh. Apply the corrections to the file:

    script.sh

    #!/bin/sh
    
    for f in ./*.m3u
    do
      grep -qi "hq.*mp3" "$f" \
        && printf "Playlist %s contains a HQ file in mp3 format\n" "$f"
    done
    You can install Neomake as a linter plugin.
    Create the plugins directory (if not already done) and download the plugin:
    mkdir -p ~/.vim/pack/plugins/start && git clone https://github.com/neomake/neomake.git ~/.vim/pack/plugins/start/neomake
    Configure Neomake to use Shellcheck:

    .vimrc

    " Enable Neomake with shellcheck
    let g:neomake_sh_enabled_makers = ['shellcheck']
    					
  4. (Advanced) Read about reversible debugging and get a simple example working using rr or RevPDB.
    I used RevPDB:
    pip install revpdb
    You can use this python script as an example:

    example.py

    def main():
        x = 0
        for i in range(5):
            x += i
            print(f"i: {i}, x: {x}")
        return x

    if __name__ == "__main__":
        main()
    Then, run the script in RevPDB:
    revpdb example.py
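    Alternatively, for the rr route (a rough sketch; rr records native binaries, so here it records the whole CPython interpreter, and it needs access to hardware performance counters, which some VMs and WSL do not provide):
    rr record python3 example.py
    rr replay
    rr replay opens the recording in gdb, where commands such as reverse-continue, reverse-next and reverse-stepi step backwards through the recorded execution.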

    Profiling
  5. Here are some sorting algorithm implementations. Use cProfile and line_profiler to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Then use memory_profiler to check the memory consumption. Why is insertion sort better? Now check the in-place version of quicksort. Challenge: Use perf to look at the cycle counts and cache hits and misses of each algorithm.
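    For reference, the three functions imported from sorts.py below (the file linked from the exercise; the same bodies appear in the profiler listings further down) are roughly:

    sorts.py

    def insertionsort(array):
        for i in range(len(array)):
            j = i-1
            v = array[i]
            while j >= 0 and v < array[j]:
                array[j+1] = array[j]
                j -= 1
            array[j+1] = v
        return array

    def quicksort(array):
        if len(array) <= 1:
            return array
        pivot = array[0]
        left = [i for i in array[1:] if i < pivot]
        right = [i for i in array[1:] if i >= pivot]
        return quicksort(left) + [pivot] + quicksort(right)

    def quicksort_inplace(array, low=0, high=None):
        if len(array) <= 1:
            return array
        if high is None:
            high = len(array)-1
        if low >= high:
            return array

        pivot = array[high]
        j = low-1
        for i in range(low, high):
            if array[i] <= pivot:
                j += 1
                array[i], array[j] = array[j], array[i]
        array[high], array[j+1] = array[j+1], array[high]
        quicksort_inplace(array, low, j)
        quicksort_inplace(array, j+2, high)
        return array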
    1. First, let's compare the sorting algorithms with a script using cProfile:

      cProfile_sorts.py

      import random
      import cProfile
      from sorts import insertionsort, quicksort, quicksort_inplace
      
      # Generate a test array
      arr = [random.randint(0, 1000) for _ in range(1000)]
      
      # Profile Insertion Sort
      print("Profiling Insertion Sort:")
      cProfile.run('insertionsort(arr.copy())')
      
      # Profile Quicksort (Non-In-Place)
      print("\nProfiling Quicksort:")
      cProfile.run('quicksort(arr.copy())')
      
      # Profile In-Place Quicksort
      print("\nProfiling In-Place Quicksort:")
      cProfile.run('quicksort_inplace(arr.copy())')
      Then, run the python script:
      python cProfile_sorts.py

      Terminal Output:

      Profiling Insertion Sort:
               6 function calls in 0.021 seconds
      
         Ordered by: standard name
      
         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.021    0.021 <string>:1(<module>)
              1    0.021    0.021    0.021    0.021 sorts.py:10(insertionsort)
              1    0.000    0.000    0.021    0.021 {built-in method builtins.exec}
              1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
              1    0.000    0.000    0.000    0.000 {method 'copy' of 'list' objects}
              1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      
      
      
      Profiling Quicksort:
               4038 function calls (2694 primitive calls) in 0.003 seconds
      
         Ordered by: standard name
      
         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.003    0.003 <string>:1(<module>)
         1345/1    0.001    0.000    0.003    0.003 sorts.py:22(quicksort)
          672    0.001    0.000    0.001    0.000 sorts.py:26(<listcomp>)
          672    0.001    0.000    0.001    0.000 sorts.py:27(<listcomp>)
              1    0.000    0.000    0.003    0.003 {built-in method builtins.exec}
           1345    0.000    0.000    0.000    0.000 {built-in method builtins.len}
              1    0.000    0.000    0.000    0.000 {method 'copy' of 'list' objects}
              1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      
      
      
      Profiling In-Place Quicksort:
               2731 function calls (1369 primitive calls) in 0.002 seconds
      
         Ordered by: standard name
      
         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.002    0.002 <string>:1(<module>)
         1363/1    0.002    0.000    0.002    0.002 sorts.py:31(quicksort_inplace)
              1    0.000    0.000    0.002    0.002 {built-in method builtins.exec}
           1364    0.000    0.000    0.000    0.000 {built-in method builtins.len}
              1    0.000    0.000    0.000    0.000 {method 'copy' of 'list' objects}
              1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
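      By default, cProfile orders the report by "standard name"; cProfile.run also accepts a sort argument, so a small tweak to the script above, for example:
      cProfile.run('quicksort(arr.copy())', sort='tottime')
      orders the rows by total time spent in each function, which makes the hot spots easier to see.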
    2. Now let's do it with line_profiler:
      First install it with pip (you also need pip itself if you don't already have it).
      pip install line_profiler
      Then, add the @profile decorator above each sort function in sorts.py and run:
      kernprof -l -v profile_sorts.py

      Terminal Output

      Wrote profile results to profile_sorts.py.lprof
      Timer unit: 1e-06 s
      
      Total time: 0.119793 s
      File: profile_sorts.py
      Function: insertionsort at line 10
      
      Line #      Hits         Time  Per Hit   % Time  Line Contents
      ==============================================================
          10                                           @profile
          11                                           def insertionsort(array):
          12                                           
          13     25031       3564.2      0.1      3.0      for i in range(len(array)):
          14     24031       3692.1      0.2      3.1          j = i-1
          15     24031       3597.3      0.1      3.0          v = array[i]
          16    216315      42516.8      0.2     35.5          while j >= 0 and v < array[j]:
          17    192284      35396.7      0.2     29.5              array[j+1] = array[j]
          18    192284      26149.8      0.1     21.8              j -= 1
          19     24031       4739.3      0.2      4.0          array[j+1] = v
          20      1000        137.0      0.1      0.1      return array
      
      Total time: 0.0660519 s
      File: profile_sorts.py
      Function: quicksort at line 22
      
      Line #      Hits         Time  Per Hit   % Time  Line Contents
      ==============================================================
          22                                           @profile
          23                                           def quicksort(array):
          24     33902       6482.3      0.2      9.8      if len(array) <= 1:
          25     17451       1815.8      0.1      2.7          return array
          26     16451       2439.0      0.1      3.7      pivot = array[0]
          27     16451      20909.7      1.3     31.7      left = [i for i in array[1:] if i < pivot]
          28     16451      21577.2      1.3     32.7      right = [i for i in array[1:] if i >= pivot]
          29     16451      12827.9      0.8     19.4      return quicksort(left) + [pivot] + quicksort(right)
      
      Total time: 0.120434 s
      File: profile_sorts.py
      Function: quicksort_inplace at line 31
      
      Line #      Hits         Time  Per Hit   % Time  Line Contents
      ==============================================================
          31                                           @profile
          32                                           def quicksort_inplace(array, low=0, high=None):
          33     34724       7101.9      0.2      5.9      if len(array) <= 1:
          34        37          4.3      0.1      0.0          return array
          35     34687       5003.9      0.1      4.2      if high is None:
          36       963        215.8      0.2      0.2          high = len(array)-1
          37     34687       5214.4      0.2      4.3      if low >= high:
          38     17825       2099.9      0.1      1.7          return array
          39                                           
          40     16862       2636.9      0.2      2.2      pivot = array[high]
          41     16862       2858.2      0.2      2.4      j = low-1
          42    129258      19442.6      0.2     16.1      for i in range(low, high):
          43    112396      18884.6      0.2     15.7          if array[i] <= pivot:
          44     58913       8311.2      0.1      6.9              j += 1
          45     58913      14901.1      0.3     12.4              array[i], array[j] = array[j], array[i]
          46     16862       5175.5      0.3      4.3      array[high], array[j+1] = array[j+1], array[high]
          47     16862      14069.0      0.8     11.7      quicksort_inplace(array, low, j)
          48     16862      12809.7      0.8     10.6      quicksort_inplace(array, j+2, high)
          49     16862       1704.8      0.1      1.4      return array
      									

      We can see that quicksort is faster than insertion sort. The bottleneck of insertion sort is the inner while loop that shifts elements one position at a time; for quicksort it is the two list comprehensions that build left and right on every call.

    3. Now, let's analyze the algorithms with memory_profiler.
      We first need to install memory_profiler and make sure each function in sorts.py is decorated with @profile.
      pip install memory_profiler
      Then, run:
      python -m memory_profiler profile_sorts.py

      Terminal Output

      Filename: profile_sorts.py
      
      Line #    Mem usage    Increment  Occurrences   Line Contents
      =============================================================
          10   19.871 MiB   19.871 MiB        1000   @profile
          11                                         def insertionsort(array):
          12                                         
          13   19.871 MiB    0.000 MiB       25738       for i in range(len(array)):
          14   19.871 MiB    0.000 MiB       24738           j = i-1
          15   19.871 MiB    0.000 MiB       24738           v = array[i]
          16   19.871 MiB    0.000 MiB      223306           while j >= 0 and v < array[j]:
          17   19.871 MiB    0.000 MiB      198568               array[j+1] = array[j]
          18   19.871 MiB    0.000 MiB      198568               j -= 1
          19   19.871 MiB    0.000 MiB       24738           array[j+1] = v
          20   19.871 MiB    0.000 MiB        1000       return array
      
      
      Filename: profile_sorts.py
      
      Line #    Mem usage    Increment  Occurrences   Line Contents
      =============================================================
          22   19.871 MiB   19.871 MiB       33780   @profile
          23                                         def quicksort(array):
          24   19.871 MiB    0.000 MiB       33780       if len(array) <= 1:
          25   19.871 MiB    0.000 MiB       17390           return array
          26   19.871 MiB    0.000 MiB       16390       pivot = array[0]
          27   19.871 MiB    0.000 MiB      156536       left = [i for i in array[1:] if i < pivot]
          28   19.871 MiB    0.000 MiB      156536       right = [i for i in array[1:] if i >= pivot]
          29   19.871 MiB    0.000 MiB       16390       return quicksort(left) + [pivot] + quicksort(right)
      
      
      Filename: profile_sorts.py
      
      Line #    Mem usage    Increment  Occurrences   Line Contents
      =============================================================
          31   19.871 MiB   19.871 MiB       33808   @profile
          32                                         def quicksort_inplace(array, low=0, high=None):
          33   19.871 MiB    0.000 MiB       33808       if len(array) <= 1:
          34   19.871 MiB    0.000 MiB          40           return array
          35   19.871 MiB    0.000 MiB       33768       if high is None:
          36   19.871 MiB    0.000 MiB         960           high = len(array)-1
          37   19.871 MiB    0.000 MiB       33768       if low >= high:
          38   19.871 MiB    0.000 MiB       17364           return array
          39                                         
          40   19.871 MiB    0.000 MiB       16404       pivot = array[high]
          41   19.871 MiB    0.000 MiB       16404       j = low-1
          42   19.871 MiB    0.000 MiB      125558       for i in range(low, high):
          43   19.871 MiB    0.000 MiB      109154           if array[i] <= pivot:
          44   19.871 MiB    0.000 MiB       56761               j += 1
          45   19.871 MiB    0.000 MiB       56761               array[i], array[j] = array[j], array[i]
          46   19.871 MiB    0.000 MiB       16404       array[high], array[j+1] = array[j+1], array[high]
          47   19.871 MiB    0.000 MiB       16404       quicksort_inplace(array, low, j)
          48   19.871 MiB    0.000 MiB       16404       quicksort_inplace(array, j+2, high)
          49   19.871 MiB    0.000 MiB       16404       return array
      										

      Insertion Sort: Minimal memory usage; memory should remain constant as it’s an in-place algorithm.

      Non-In-Place Quicksort: Likely shows increased memory usage due to list slicing, which creates new lists at each recursive call.

      In-Place Quicksort: Similar to insertion sort in memory efficiency, as it only works within the original list.
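      If, as in the output above, the Increment column stays at 0.000 MiB even for the non-in-place quicksort, the 1000-element test array is simply too small for the sliced copies to register at MiB granularity; rerunning with a larger array, or plotting usage over time with memory_profiler's mprof tool, makes the difference visible:
      mprof run profile_sorts.py
      mprof plot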

    4. Now let's use perf to look at the cycle count and the cache hits and misses of each algorithm.
      WSL doesn't fully support kernel features like perf, but you can still do this from a Linux VM (or a native Linux install).

      First, let's install Perf with:
      sudo apt install linux-tools-common linux-tools-generic linux-tools-$(uname -r)
      Then create a new file to call each function separately (this script also generates an array of 10,000 random numbers to make the differences more visible):

      profile_sorts_perf.py

      # profile_sorts_perf.py
      
      import random
      from sorts import insertionsort, quicksort, quicksort_inplace
      
      # Generate a large test array
      arr = [random.randint(0, 1000) for _ in range(10000)]
      
      def run_insertion_sort():
          insertionsort(arr.copy())
      
      def run_quicksort():
          quicksort(arr.copy())
      
      def run_quicksort_inplace():
          quicksort_inplace(arr.copy())
      Then run perf for each function (change the quoted part accordingly):
      perf stat -e cycles,cache-references,cache-misses python -c "from profile_sorts_perf import run_insertion_sort; run_insertion_sort()"

      Terminal Output:

      Performance counter stats for 'python -c "from profile_sorts_perf import run_insertion_sort; run_insertion_sort()"':
         
                 500,000      cycles                    # Total CPU cycles
                 250,000      cache-references          # Cache references
                  50,000      cache-misses              # Cache misses
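      With a 10,000-element array you should expect insertion sort to show far more cycles than either quicksort (it does on the order of n² comparisons and shifts), and the non-in-place quicksort to show noticeably more cache references and misses than the in-place version, since every recursive call allocates new left and right lists.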
      									
  6. Here’s some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number.

    #!/usr/bin/env python
    def fib0(): return 0
    
    def fib1(): return 1
    
    s = """def fib{}(): return fib{}() + fib{}()"""
    
    if __name__ == '__main__':
    
        for n in range(2, 10):
            exec(s.format(n, n-1, n-2))
        # from functools import lru_cache
        # for n in range(10):
        #     exec("fib{} = lru_cache(1)(fib{})".format(n, n))
        print(eval("fib9()"))

    Put the code into a file and make it executable. Install prerequisites: pycallgraph and graphviz. (If you can run dot, you already have GraphViz.) Run the code as is with pycallgraph graphviz -- ./fib.py and check the pycallgraph.png file. How many times is fib0 called? We can do better than that by memoizing the functions. Uncomment the commented lines and regenerate the images. How many times are we calling each fibN function now?

    Install pycallgraph with pip, and GraphViz (which provides the dot binary) with your package manager:
    pip install pycallgraph
    sudo apt install graphviz
    Run the script with:
    pycallgraph graphviz -- ./fib.py
    fib0 is called 21 times (computing fib9 this way ends up calling fib0 F(8) = 21 times). If we uncomment the lines related to lru_cache and rerun the command above, we get a single call to fib0 and to each other fibN, but <module> is now called 20 times instead of 10 because of the extra exec calls.
  7. A common issue is that a port you want to listen on is already taken by another process. Let’s learn how to discover that process pid. First execute python -m http.server 4444 to start a minimal web server listening on port 4444. On a separate terminal run lsof | grep LISTEN to print all listening processes and ports. Find that process pid and terminate it by running kill <PID>.
    You want to look for the PID (second column) of the line ending with *:4444 (LISTEN)
    kill <PID>

    To verify that it worked, run lsof again or check that the web server in the first terminal has stopped.
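    Alternatively, lsof can filter by port directly, which avoids the grep:
    lsof -i :4444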

  8. Limiting a process’s resources can be another handy tool in your toolbox. Try running stress -c 3 and visualize the CPU consumption with htop. Now, execute taskset --cpu-list 0,2 stress -c 3 and visualize it. Is stress taking three CPUs? Why not? Read man taskset. Challenge: achieve the same using cgroups. Try limiting the memory consumption of stress -m.

    Install both stress and htop (on Debian/Ubuntu: sudo apt install stress htop).

    Now run stress -c 3 and htop in a different terminal (you can stop the stress command with Ctrl-C).
    You should see that stress spawns three workers, each keeping one CPU near 100%.

    Now, run taskset --cpu-list 0,2 stress -c 3. With htop we can see that the three stress workers now only use two CPUs (0 and 2). This is because taskset restricts the process's CPU affinity to the cores we listed, so the three workers have to share those two cores.


    Now, let's do it with cgroups.

    First, we have to create a cgroup for memory management (you also need to install the cgroup tools with sudo apt install cgroup-tools):
    sudo cgcreate -g memory:/my_cgroup
    Then, we set the memory limit (for this example memory limit will be 100 MB):
    echo 100M | sudo tee /sys/fs/cgroup/memory/my_cgroup/memory.limit_in_bytes
    Finally, we can run the stress command (--vm-bytes 128M will attempt to use 128 MB):
    sudo cgexec -g memory:my_cgroup stress -m 1 --vm-bytes 128M
    We can verify that it worked using htop: the stress worker's resident memory should stay at or below about 100 MB because of the cgroup limit.
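    On newer distributions that use cgroup v2, the memory controller exposes memory.max instead of memory.limit_in_bytes, and the cgcreate/cgexec tools may not be available; a rough equivalent there (a sketch using systemd's cgroup integration) is:
    sudo systemd-run --scope -p MemoryMax=100M stress -m 1 --vm-bytes 128M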
  9. (Advanced) The command curl ipinfo.io performs an HTTP request and fetches information about your public IP. Open Wireshark and try to sniff the request and reply packets that curl sent and received. (Hint: Use the http filter to just watch HTTP packets).

    After installing Wireshark, you can open it by typing wireshark in the terminal.

    Then, select the correct network interface (one where there is traffic; 'any' works if you are not sure which one to pick).
    You can filter out non-HTTP traffic by typing http in the display-filter bar.

    In a separate terminal window, run curl ipinfo.io.
    You should see the captured packets in Wireshark. You can recognize the request by the GET method shown in the Info column, and the response by its 200 OK status code and the JSON payload carrying the IP information.
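    Note that curl ipinfo.io makes a plain HTTP request on port 80, which is why the http filter can show the request and reply in clear text; if you query https://ipinfo.io instead, the traffic is TLS-encrypted and the http filter won't match it.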