Lecture 7

    Debugging
  1. Use journalctl on Linux or log show on macOS to get the super user accesses and commands in the last day. If there aren’t any you can execute some harmless commands such as sudo ls and check again.
    sudo journalctl _COMM=sudo --since yesterday
  2. Do this hands on pdb tutorial to familiarize yourself with the commands. For a more in depth tutorial read this.
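    As a quick warm-up (a minimal sketch, separate from the linked tutorial; the file and function names here are made up), you can drop into the debugger with breakpoint() and then try the usual pdb commands:

    pdb_example.py

    # pdb_example.py -- hypothetical toy script, just something to step through
    def average(numbers):
        total = 0
        for n in numbers:
            total += n
        breakpoint()  # pauses here and opens a pdb prompt (Python 3.7+)
        return total / len(numbers)

    if __name__ == "__main__":
        print(average([1, 2, 3, 4]))
    At the (Pdb) prompt, l lists the surrounding source, n and s step over/into the next line, p total prints a variable, w shows the stack, c continues, and q quits. You can also run the whole script under the debugger with python -m pdb pdb_example.py.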
  3. Install shellcheck and try checking the following script. What is wrong with the code? Fix it. Install a linter plugin in your editor so you can get your warnings automatically.

    #!/bin/sh
    ## Example: a typical script with several problems
    for f in $(ls *.m3u)
    do
      grep -qi hq.*mp3 $f \
        && echo -e 'Playlist $f contains a HQ file in mp3 format'
    done
    After installing Shellcheck (sudo apt install shellcheck), run this command.
    shellcheck script.sh
    ShellCheck flags several issues: the loop iterates over ls output instead of a glob, the grep pattern and $f are unquoted (so they are subject to globbing and word splitting), the single-quoted echo string prevents $f from expanding, and echo -e is not portable in POSIX sh. Apply the corrections to the file:

    script.sh

    #!/bin/sh
    
    for f in ./*.m3u
    do
      grep -qi "hq.*mp3" "$f" \
        && printf "Playlist %s contains a HQ file in mp3 format\n" "$f"
    done
    You can install Neomake as a linter plugin.
    Create the plugins directory (if not already done) and download the plugin:
    mkdir -p ~/.vim/pack/plugins/start && git clone https://github.com/neomake/neomake.git ~/.vim/pack/plugins/start/neomake
    Configure Neomake to use Shellcheck:

    .vimrc

    " Enable Neomake with shellcheck
    let g:neomake_sh_enabled_makers = ['shellcheck']
    					
  4. (Advanced) Read about reversible debugging and get a simple example working using rr or RevPDB.
    I used RevPDB:
    pip install revpdb
    You can use this python script as an example:

    example.py

    def main():
        x = 0
        for i in range(5):
            x += i
            print(f"i: {i}, x: {x}")
        return x

    if __name__ == "__main__":
        main()
    Then, run the script in RevPDB:
    revpdb example.py
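    Alternatively, for the rr route (a rough sketch; rr records native binaries, so here it records the whole CPython interpreter, and it needs access to hardware performance counters, which some VMs and WSL do not provide):
    rr record python3 example.py
    rr replay
    rr replay opens the recording in gdb, where commands such as reverse-continue, reverse-next and reverse-stepi step backwards through the recorded execution.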

    Profiling
  5. Here are some sorting algorithm implementations. Use cProfile and line_profiler to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Then use memory_profiler to check the memory consumption. Why is insertion sort better? Now check the in-place version of quicksort. Challenge: Use perf to look at the cycle counts and cache hits and misses of each algorithm.
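    For reference, the three functions imported from sorts.py below (the file linked from the exercise; the same bodies appear in the profiler listings further down) are roughly:

    sorts.py

    def insertionsort(array):
        for i in range(len(array)):
            j = i-1
            v = array[i]
            while j >= 0 and v < array[j]:
                array[j+1] = array[j]
                j -= 1
            array[j+1] = v
        return array

    def quicksort(array):
        if len(array) <= 1:
            return array
        pivot = array[0]
        left = [i for i in array[1:] if i < pivot]
        right = [i for i in array[1:] if i >= pivot]
        return quicksort(left) + [pivot] + quicksort(right)

    def quicksort_inplace(array, low=0, high=None):
        if len(array) <= 1:
            return array
        if high is None:
            high = len(array)-1
        if low >= high:
            return array

        pivot = array[high]
        j = low-1
        for i in range(low, high):
            if array[i] <= pivot:
                j += 1
                array[i], array[j] = array[j], array[i]
        array[high], array[j+1] = array[j+1], array[high]
        quicksort_inplace(array, low, j)
        quicksort_inplace(array, j+2, high)
        return array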
    1. First, let's compare the sorting algorithms with a script using cProfile:

      cProfile_sorts.py

      import random
      import cProfile
      from sorts import insertionsort, quicksort, quicksort_inplace
      
      # Generate a test array
      arr = [random.randint(0, 1000) for _ in range(1000)]
      
      # Profile Insertion Sort
      print("Profiling Insertion Sort:")
      cProfile.run('insertionsort(arr.copy())')
      
      # Profile Quicksort (Non-In-Place)
      print("\nProfiling Quicksort:")
      cProfile.run('quicksort(arr.copy())')
      
      # Profile In-Place Quicksort
      print("\nProfiling In-Place Quicksort:")
      cProfile.run('quicksort_inplace(arr.copy())')
      Then, run the python script:
      python cProfile_sorts.py

      Terminal Output:

      Profiling Insertion Sort:
               6 function calls in 0.021 seconds
      
         Ordered by: standard name
      
         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.021    0.021 <string>:1(<module>)
              1    0.021    0.021    0.021    0.021 sorts.py:10(insertionsort)
              1    0.000    0.000    0.021    0.021 {built-in method builtins.exec}
              1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
              1    0.000    0.000    0.000    0.000 {method 'copy' of 'list' objects}
              1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      
      
      
      Profiling Quicksort:
               4038 function calls (2694 primitive calls) in 0.003 seconds
      
         Ordered by: standard name
      
         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.003    0.003 <string>:1(<module>)
         1345/1    0.001    0.000    0.003    0.003 sorts.py:22(quicksort)
          672    0.001    0.000    0.001    0.000 sorts.py:26(<listcomp>)
          672    0.001    0.000    0.001    0.000 sorts.py:27(<listcomp>)
              1    0.000    0.000    0.003    0.003 {built-in method builtins.exec}
           1345    0.000    0.000    0.000    0.000 {built-in method builtins.len}
              1    0.000    0.000    0.000    0.000 {method 'copy' of 'list' objects}
              1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      
      
      
      Profiling In-Place Quicksort:
               2731 function calls (1369 primitive calls) in 0.002 seconds
      
         Ordered by: standard name
      
         ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.002    0.002 <string>:1(<module>)
         1363/1    0.002    0.000    0.002    0.002 sorts.py:31(quicksort_inplace)
              1    0.000    0.000    0.002    0.002 {built-in method builtins.exec}
           1364    0.000    0.000    0.000    0.000 {built-in method builtins.len}
              1    0.000    0.000    0.000    0.000 {method 'copy' of 'list' objects}
              1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
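      By default, cProfile orders the report by "standard name"; cProfile.run also accepts a sort argument, so a small tweak to the script above, for example:
      cProfile.run('quicksort(arr.copy())', sort='tottime')
      orders the rows by total time spent in each function, which makes the hot spots easier to see.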
    2. Now let's do it with line_profiler:
      First install it with pip (you also need pip itself if you don't already have it).
      pip install line_profiler
      Then, add the @profile decorator above each sort function in sorts.py and run:
      kernprof -l -v profile_sorts.py

      Terminal Output

      Wrote profile results to profile_sorts.py.lprof
      Timer unit: 1e-06 s
      
      Total time: 0.119793 s
      File: profile_sorts.py
      Function: insertionsort at line 10
      
      Line #      Hits         Time  Per Hit   % Time  Line Contents
      ==============================================================
          10                                           @profile
          11                                           def insertionsort(array):
          12                                           
          13     25031       3564.2      0.1      3.0      for i in range(len(array)):
          14     24031       3692.1      0.2      3.1          j = i-1
          15     24031       3597.3      0.1      3.0          v = array[i]
          16    216315      42516.8      0.2     35.5          while j >= 0 and v < array[j]:
          17    192284      35396.7      0.2     29.5              array[j+1] = array[j]
          18    192284      26149.8      0.1     21.8              j -= 1
          19     24031       4739.3      0.2      4.0          array[j+1] = v
          20      1000        137.0      0.1      0.1      return array
      
      Total time: 0.0660519 s
      File: profile_sorts.py
      Function: quicksort at line 22
      
      Line #      Hits         Time  Per Hit   % Time  Line Contents
      ==============================================================
          22                                           @profile
          23                                           def quicksort(array):
          24     33902       6482.3      0.2      9.8      if len(array) <= 1:
          25     17451       1815.8      0.1      2.7          return array
          26     16451       2439.0      0.1      3.7      pivot = array[0]
          27     16451      20909.7      1.3     31.7      left = [i for i in array[1:] if i < pivot]
          28     16451      21577.2      1.3     32.7      right = [i for i in array[1:] if i >= pivot]
          29     16451      12827.9      0.8     19.4      return quicksort(left) + [pivot] + quicksort(right)
      
      Total time: 0.120434 s
      File: profile_sorts.py
      Function: quicksort_inplace at line 31
      
      Line #      Hits         Time  Per Hit   % Time  Line Contents
      ==============================================================
          31                                           @profile
          32                                           def quicksort_inplace(array, low=0, high=None):
          33     34724       7101.9      0.2      5.9      if len(array) <= 1:
          34        37          4.3      0.1      0.0          return array
          35     34687       5003.9      0.1      4.2      if high is None:
          36       963        215.8      0.2      0.2          high = len(array)-1
          37     34687       5214.4      0.2      4.3      if low >= high:
          38     17825       2099.9      0.1      1.7          return array
          39                                           
          40     16862       2636.9      0.2      2.2      pivot = array[high]
          41     16862       2858.2      0.2      2.4      j = low-1
          42    129258      19442.6      0.2     16.1      for i in range(low, high):
          43    112396      18884.6      0.2     15.7          if array[i] <= pivot:
          44     58913       8311.2      0.1      6.9              j += 1
          45     58913      14901.1      0.3     12.4              array[i], array[j] = array[j], array[i]
          46     16862       5175.5      0.3      4.3      array[high], array[j+1] = array[j+1], array[high]
          47     16862      14069.0      0.8     11.7      quicksort_inplace(array, low, j)
          48     16862      12809.7      0.8     10.6      quicksort_inplace(array, j+2, high)
          49     16862       1704.8      0.1      1.4      return array
      									

      We can see that quicksort is faster than insertion sort. The bottleneck of insertion sort is the inner while loop that shifts elements one position at a time; for quicksort it is the two list comprehensions that build left and right on every call.

    3. Now, let's analyze the algorithms with memory_profiler.
      We first need to install memory_profiler and make sure each function in sorts.py is decorated with @profile.
      pip install memory_profiler
      Then, run:
      python -m memory_profiler profile_sorts.py

      Terminal Output

      Filename: profile_sorts.py
      
      Line #    Mem usage    Increment  Occurrences   Line Contents
      =============================================================
          10   19.871 MiB   19.871 MiB        1000   @profile
          11                                         def insertionsort(array):
          12                                         
          13   19.871 MiB    0.000 MiB       25738       for i in range(len(array)):
          14   19.871 MiB    0.000 MiB       24738           j = i-1
          15   19.871 MiB    0.000 MiB       24738           v = array[i]
          16   19.871 MiB    0.000 MiB      223306           while j >= 0 and v < array[j]:
          17   19.871 MiB    0.000 MiB      198568               array[j+1] = array[j]
          18   19.871 MiB    0.000 MiB      198568               j -= 1
          19   19.871 MiB    0.000 MiB       24738           array[j+1] = v
          20   19.871 MiB    0.000 MiB        1000       return array
      
      
      Filename: profile_sorts.py
      
      Line #    Mem usage    Increment  Occurrences   Line Contents
      =============================================================
          22   19.871 MiB   19.871 MiB       33780   @profile
          23                                         def quicksort(array):
          24   19.871 MiB    0.000 MiB       33780       if len(array) <= 1:
          25   19.871 MiB    0.000 MiB       17390           return array
          26   19.871 MiB    0.000 MiB       16390       pivot = array[0]
          27   19.871 MiB    0.000 MiB      156536       left = [i for i in array[1:] if i < pivot]
          28   19.871 MiB    0.000 MiB      156536       right = [i for i in array[1:] if i >= pivot]
          29   19.871 MiB    0.000 MiB       16390       return quicksort(left) + [pivot] + quicksort(right)
      
      
      Filename: profile_sorts.py
      
      Line #    Mem usage    Increment  Occurrences   Line Contents
      =============================================================
          31   19.871 MiB   19.871 MiB       33808   @profile
          32                                         def quicksort_inplace(array, low=0, high=None):
          33   19.871 MiB    0.000 MiB       33808       if len(array) <= 1:
          34   19.871 MiB    0.000 MiB          40           return array
          35   19.871 MiB    0.000 MiB       33768       if high is None:
          36   19.871 MiB    0.000 MiB         960           high = len(array)-1
          37   19.871 MiB    0.000 MiB       33768       if low >= high:
          38   19.871 MiB    0.000 MiB       17364           return array
          39                                         
          40   19.871 MiB    0.000 MiB       16404       pivot = array[high]
          41   19.871 MiB    0.000 MiB       16404       j = low-1
          42   19.871 MiB    0.000 MiB      125558       for i in range(low, high):
          43   19.871 MiB    0.000 MiB      109154           if array[i] <= pivot:
          44   19.871 MiB    0.000 MiB       56761               j += 1
          45   19.871 MiB    0.000 MiB       56761               array[i], array[j] = array[j], array[i]
          46   19.871 MiB    0.000 MiB       16404       array[high], array[j+1] = array[j+1], array[high]
          47   19.871 MiB    0.000 MiB       16404       quicksort_inplace(array, low, j)
          48   19.871 MiB    0.000 MiB       16404       quicksort_inplace(array, j+2, high)
          49   19.871 MiB    0.000 MiB       16404       return array
      										

      Insertion Sort: Minimal memory usage; memory should remain constant as it’s an in-place algorithm.

      Non-In-Place Quicksort: Likely shows increased memory usage due to list slicing, which creates new lists at each recursive call.

      In-Place Quicksort: Similar to insertion sort in memory efficiency, as it only works within the original list.
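      If, as in the output above, the Increment column stays at 0.000 MiB even for the non-in-place quicksort, the 1000-element test array is simply too small for the sliced copies to register at MiB granularity; rerunning with a larger array, or plotting usage over time with memory_profiler's mprof tool, makes the difference visible:
      mprof run profile_sorts.py
      mprof plot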

    4. Now let's use perf to look at the cycle count and the cache hits and misses of each algorithm.
      WSL doesn't fully support kernel features like perf, but you can still do this from a Linux VM (or a native Linux install).

      First, let's install Perf with:
      sudo apt install linux-tools-common linux-tools-generic linux-tools-$(uname -r)
      Then create a new file to call each function separately (this script also generates an array of 10,000 random numbers to make the differences more visible):

      profile_sorts_perf.py

      # profile_sorts_perf.py
      
      import random
      from sorts import insertionsort, quicksort, quicksort_inplace
      
      # Generate a large test array
      arr = [random.randint(0, 1000) for _ in range(10000)]
      
      def run_insertion_sort():
          insertionsort(arr.copy())
      
      def run_quicksort():
          quicksort(arr.copy())
      
      def run_quicksort_inplace():
          quicksort_inplace(arr.copy())
      Then run perf for each function (change the quoted part accordingly):
      perf stat -e cycles,cache-references,cache-misses python -c "from profile_sorts_perf import run_insertion_sort; run_insertion_sort()"

      Terminal Output:

      Performance counter stats for 'python -c "from profile_sorts_perf import run_insertion_sort; run_insertion_sort()"':
         
                 500,000      cycles                    # Total CPU cycles
                 250,000      cache-references          # Cache references
                  50,000      cache-misses              # Cache misses
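      With a 10,000-element array you should expect insertion sort to show far more cycles than either quicksort (it does on the order of n² comparisons and shifts), and the non-in-place quicksort to show noticeably more cache references and misses than the in-place version, since every recursive call allocates new left and right lists.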
      									
  6. Here’s some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number.

    #!/usr/bin/env python
    def fib0(): return 0
    
    def fib1(): return 1
    
    s = """def fib{}(): return fib{}() + fib{}()"""
    
    if __name__ == '__main__':
    
        for n in range(2, 10):
            exec(s.format(n, n-1, n-2))
        # from functools import lru_cache
        # for n in range(10):
        #     exec("fib{} = lru_cache(1)(fib{})".format(n, n))
        print(eval("fib9()"))

    Put the code into a file and make it executable. Install prerequisites: pycallgraph and graphviz. (If you can run dot, you already have GraphViz.) Run the code as is with pycallgraph graphviz -- ./fib.py and check the pycallgraph.png file. How many times is fib0 called? We can do better than that by memoizing the functions. Uncomment the commented lines and regenerate the images. How many times are we calling each fibN function now?

    Install pycallgraph with pip, and GraphViz (which provides the dot binary) with your package manager:
    pip install pycallgraph
    sudo apt install graphviz
    Run the script with:
    pycallgraph graphviz -- ./fib.py
    fib0 is called 21 times (computing fib9 this way ends up calling fib0 F(8) = 21 times). If we uncomment the lines related to lru_cache and rerun the command above, we get a single call to fib0 and to each other fibN, but <module> is now called 20 times instead of 10 because of the extra exec calls.
  7. A common issue is that a port you want to listen on is already taken by another process. Let’s learn how to discover that process pid. First execute python -m http.server 4444 to start a minimal web server listening on port 4444. On a separate terminal run lsof | grep LISTEN to print all listening processes and ports. Find that process pid and terminate it by running kill <PID>.
    You want to look for the PID (second column) of the line ending with *:4444 (LISTEN)
    kill <PID>

    To verify that it worked, run lsof again or check that the web server in the first terminal has stopped.
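    Alternatively, lsof can filter by port directly, which avoids the grep:
    lsof -i :4444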

  8. Limiting a process’s resources can be another handy tool in your toolbox. Try running stress -c 3 and visualize the CPU consumption with htop. Now, execute taskset --cpu-list 0,2 stress -c 3 and visualize it. Is stress taking three CPUs? Why not? Read man taskset. Challenge: achieve the same using cgroups. Try limiting the memory consumption of stress -m.

    Install both stress and htop (on Debian/Ubuntu: sudo apt install stress htop).

    Now run stress -c 3 and htop in a different terminal (you can stop the stress command with Ctrl-C).
    You should see that stress spawns three workers, each keeping one CPU near 100%.

    Now, run taskset --cpu-list 0,2 stress -c 3. With htop we can see that the three stress workers now only use two CPUs (0 and 2). This is because taskset restricts the process's CPU affinity to the cores we listed, so the three workers have to share those two cores.


    Now, let's do it with cgroups.

    First, we have to create a cgroup for memory management (you also need to install the cgroup tools with sudo apt install cgroup-tools):
    sudo cgcreate -g memory:/my_cgroup
    Then, we set the memory limit (for this example memory limit will be 100 MB):
    echo 100M | sudo tee /sys/fs/cgroup/memory/my_cgroup/memory.limit_in_bytes
    Finally, we can run the stress command (--vm-bytes 128M will attempt to use 128 MB):
    sudo cgexec -g memory:my_cgroup stress -m 1 --vm-bytes 128M
    We can verify that it worked using htop: the stress worker's resident memory should stay at or below about 100 MB because of the cgroup limit.
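    On newer distributions that use cgroup v2, the memory controller exposes memory.max instead of memory.limit_in_bytes, and the cgcreate/cgexec tools may not be available; a rough equivalent there (a sketch using systemd's cgroup integration) is:
    sudo systemd-run --scope -p MemoryMax=100M stress -m 1 --vm-bytes 128M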
  9. (Advanced) The command curl ipinfo.io performs an HTTP request and fetches information about your public IP. Open Wireshark and try to sniff the request and reply packets that curl sent and received. (Hint: Use the http filter to just watch HTTP packets).

    After installing Wireshark, you can open it by typing wireshark in the terminal.

    Then, select the correct network interface (one where there is traffic; 'any' works if you are not sure which one to pick).
    You can filter out non-HTTP traffic by typing http in the display-filter bar.

    In a separate terminal window, run curl ipinfo.io.
    You should see the captured packets in Wireshark. You can recognize the request by the GET method shown in the Info column, and the response by its 200 OK status code and the JSON payload carrying the IP information.
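    Note that curl ipinfo.io makes a plain HTTP request on port 80, which is why the http filter can show the request and reply in clear text; if you query https://ipinfo.io instead, the traffic is TLS-encrypted and the http filter won't match it.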