Sorting Algorithms
Working with Data Structures and manipulating data.
import random
numbers = []
for i in range(10):
numbers.append(random.randint(0,100))
print("Random List")
print(numbers)
Warm Up
Discuss with a partner... What are some strategies you would use to sort this list? (Don't worry about writing code for now)
- Find the smallest element. Move the smallest element to the 0th index, and move the element on the 0th index to the location of the smallest element. Repeat this process until the list is sorted.
Explore
Get into groups of 3
We will be focusing on 4 algorithms today.
We will look at the first one together, Bubble Sort
What is happening with this sort?
In your groups you will each choose to be an expert on a sorting algorithm. Merge, Selection, and Insertion. Take about 5 minutes to read about your algorithm (Geek for Geeks linked below) and be ready to explain it to your other group members.
Practice
[75, 17, 46, 80, 67, 45, 69, 79, 40, 0]
How would you sort this list with...
- Bubble Sort
- Selection Sort
Explain.
-
Bubble sort:Compare 75 with 17. Swap 75 and 17. The list looks like: [17, 75, 46, 80, 67, 45, 69, 79, 40, 0] Compare 75 with 46. Swap them. The list looks like: [17, 46, 75, 80, 67, 45, 69, 79, 40, 0]
Compare 75 and 80. Since 75 is less than 80, the two elements are not swapped.
Compare 80 and 67. Since 80 is greater than 67, swap the two elements.
Repeat this process until the list looks like this: [17, 75, 46, 67, 45, 69, 79, 40, 0, 80]
Start at the beginning of the list and repeat the process.
-
Selection sort: Find the smallest element in the list, which is 0. Swap 0 and 75. The list looks like: [0, 17, 46, 80, 67, 45, 69, 79, 40, 75]
Find the smallest element in the list, excluding the first element. Keep 17 at the first index.
Find the smallest element in the list, excluding the first and second element. Swap 46 and 40: [0, 17, 40, 80, 67, 45, 69, 79, 46, 75]
[88, 39, 53, 39, 58, 43, 74, 81, 71, 51]
How would you sort this list with...
-
Merge Sort
Note: Middle = (left + right)/2
Split the list in half (first list starts from 0th index and ends at 4th index, second list starts from 5th index and ends at 9th index). The two lists look like this: [88, 39, 53, 39, 58] and [43, 74, 81, 71, 51].
Continue splitting the lists in half until the elements end up like this:
[88] [39], [53], [39] [58], [43] [74], [81], [71] [51]
Sort the elements and combine them:
[39, 88], [53], [39, 58], [43, 74], [81], [51, 71]
[39, 53, 88], [39, 58], [43, 74, 81], [51, 71]
[39, 39, 53, 59, 99], [43, 51, 71, 74, 81]
[39, 39, 43, 51, 53, 59, 71, 74, 91, 99]
-
Insertion Sort
[88, 39, 53, 39, 58, 43, 74, 81, 71, 51]
Compare 39 and 88. Since 88 is greater than 39, swap them: [39, 88, 53, 39, 58, 43, 74, 81, 71, 51]
Move 53 into the correct position: [39, 53, 88, 39, 58, 43, 74, 81, 71, 51]
Then move 39 in the correct position and repeat.
Explain.
import nltk
import random
nltk.download('words') # Download the word list (only required once)
from nltk.corpus import words
english_words = words.words()
#print(len(english_words)) # Prints the number of words in the list
# You can now use the 'english_words' list in your code
words = []
for i in range(10):
words.append(english_words[random.randint(0,len(english_words))])
print("Random List")
print(words)
['bareheadedness', 'horizon', 'oiticica', 'photochloride', 'reapable', 'rosin', 'stylomaxillary', 'unperused', 'unprefined' 'unsurgical']
Discuss
Answer the following with your group.
- When should you use each algorithm? What makes an algorithm the right choice?
-
Given the following lists...
-
[0, 2, 6, 4, 8, 10]
Insertion, because the list is mostly sorted, so each element is only compared with the one next to it.
-
[Elephant, Banana, Cat, Dog, Apple]
Selection, because it finds the smallest element and moves it to the front of the list. Bubble would use too many swaps, while insertion would also need multiple swaps because the smallest element is at the end of the list, which means that the element needs to be swapped multiple times before it can reach the front of the list.
-
[29, 13, 83, 47, 32, 78, 100, 60, 65, 15, 24, 9, 40, 68, 53, 8, 90, 58, 39, 32, 34, 91, 74, 94, 49, 87, 34, 87, 23, 17, 27, 2, 38, 58, 84, 15, 9, 46, 74, 40, 44, 8, 55, 28, 81, 92, 81, 88, 53, 38, 19, 21, 9, 54, 21, 67, 3, 41, 3, 74, 13, 71, 70, 45, 5, 36, 80, 64, 97, 86, 73, 74, 94, 79, 49, 32, 20, 68, 64, 69, 1, 77, 31, 56, 100, 80, 48, 75, 85, 93, 67, 57, 26, 56, 43, 53, 59, 28, 67, 50]
Merge, because it has the best time complexity between the four sorts (O(n log(n))). All of the other sorts would require a large number of swaps. Select the algorithm you believe is best for each, explain.
-
HACKS
Provided below is a Bubble Sort Algorithm sorting a list of dictionaries based off of selected key.
-
Now it's time to do some coding...
-
Run code and then research and answer these questions...
-
Is a list and/or dictionary in python considered a primitive or collection type? Why?
Collection type, because it stores values into a "collection". A primitive includes data types such as integer or booleans.
-
Is the list passed into bubble sort "pass-by-value" or "pass-by-reference? Describe why in relation to output.
pass-by-value, because the values of the list changes.
-
-
Implement new cell(s) and/or organize cells to do the following.
- Create your own list
- Use your expertise sorting algorithm (selection, insertion, merge). Note, I got my bubble sort from Geek for Geeks and made modifications. Each student in a group should have a unique algorithm.
- Test your list with my bubble sort
- Test my list with your new sort
- Research analysis on sorting:comparisons, swaps, time. Build this into your hacks. - Find a better way to print the data, key first, then other elements in viewable form.
Use the code below to help guide your adventure
"""
* Creator: Nighthawk Coding Society
Bubble Sort of a List with optimizations
"""
import time
# bubble sorts a list of dictionaries, base off of provided key
def bubbleSort(list, key):
comparison = 0
swap = 0
n = len(list) - 1 # list are indexed 0 to n-1, len is n
# Traverse through list with i index
for i in range(n):
swapped = False # optimize code, so it exits if now swaps on inner loop
# Inner traversal using j index
for j in range(n-i): # n-i as positions on right are in order in bubble
# Swap if the element KeyN is greater KeyN1
keyN = list[j].get(key)
keyN1 = list[j+1].get(key)
if keyN > keyN1:
swap += 1
comparison += 1
swapped = True
list[j], list[j + 1] = list[j + 1], list[j] # single line swap
if not swapped: # if no swaps on inner pass, list is sorted
return (comparison, swap) # exit function
return (comparison, swap)
if __name__ == "__main__":
# list/dictionary sample
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
# assuming uniform keys, pick 1st row as source of keys
key_row = list_of_people[0]
# print list as defined
print("Original")
print(list_of_people)
start = time.time()
for key in key_row: # finds each key in the row
print(key)
comparison, swap = bubbleSort(list_of_people, key) # sort list of people
print(list_of_people)
print("Number of comparisons: " + str(comparison))
print("Number of swaps: " + str(swap))
end = time.time()
print("Time elapsed: " + str(end-start))
"""
* Creator: Nighthawk Coding Society
Bubble Sort of a List with optimizations
"""
# bubble sorts a list of dictionaries, base off of provided key
def insertionSort(list, key):
comparison = 0
swap = 0
# Traverse through list with i index
for i in range(1, len(list)):
value = str(list[i].get(key))
valueOriginal = list[i]
j = i - 1
while (j >= 0) and (value < str(list[j].get(key))):
comparison += 1
list[j + 1] = list[j]
swap += 1
j -= 1
list[j + 1] = valueOriginal
swap += 1
return (comparison, swap)
if __name__ == "__main__":
# list/dictionary sample
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
# assuming uniform keys, pick 1st row as source of keys
key_row = list_of_people[0]
# print list as defined
print("Original")
print(list_of_people)
start = time.time()
for key in key_row: # finds each key in the row
print(key)
comparison, swap = insertionSort(list_of_people, key) # sort list of people
print(list_of_people)
print("Number of comparisons: " + str(comparison))
print("Number of swaps: " + str(swap))
end = time.time()
print("Time elapsed: " + str(end-start))
"""
* Creator: Nighthawk Coding Society
Bubble Sort of a List with optimizations
"""
import time
# bubble sorts a list of dictionaries, base off of provided key
def bubbleSort(list, key):
comparison = 0
swap = 0
n = len(list) - 1 # list are indexed 0 to n-1, len is n
# Traverse through list with i index
for i in range(n):
swapped = False # optimize code, so it exits if now swaps on inner loop
# Inner traversal using j index
for j in range(n-i): # n-i as positions on right are in order in bubble
# Swap if the element KeyN is greater KeyN1
keyN = list[j].get(key)
keyN1 = list[j+1].get(key)
if keyN > keyN1:
swap += 1
comparison += 1
swapped = True
list[j], list[j + 1] = list[j + 1], list[j] # single line swap
if not swapped: # if no swaps on inner pass, list is sorted
return (comparison, swap) # exit function
return (comparison, swap)
if __name__ == "__main__":
# list/dictionary sample
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
# assuming uniform keys, pick 1st row as source of keys
key_row = list_of_people[0]
# print list as defined
start = time.time()
for key in key_row: # finds each key in the row
print(key)
comparison, swap = bubbleSort(list_of_people, key) # sort list of people
for person in list_of_people:
print("Name: " + str(person["name"]))
print("Age: " + str(person["age"]))
print("City: " + str(person["city"]))
print("\n")
print("Number of comparisons: " + str(comparison))
print("Number of swaps: " + str(swap))
end = time.time()
print("Time elapsed: " + str(end-start))