Saturday, March 29, 2014

Week 11: Sorting and Efficiency

There are many ways to sort a list. When the list is short, the sorting time does not vary much between algorithms; as the length grows, however, the times differ significantly. This is because each algorithm has a different complexity, which we measure with Big O( ).

The idea of Big O( ) is brand new to me. Big O( ) characterizes a function's growth rate. For example, 2 n ** 2 and n ** 2 are both Big O(n**2) because when n doubles, the running time becomes 4 times longer in both cases. In other words, we don't need to care much about the constant; what we do need to care about is the main term.

As the input size increases, the running time of an O(1) operation does not increase, because it takes constant time; moving up from constant through linear to quadratic, the growth rates get steeper. When choosing an algorithm, we usually want to consider the worst case instead of the best case. I think we should choose different algorithms depending on the input size. If the size is small, the sorting times are basically the same, and sometimes a higher-complexity algorithm can even be better. However, if the input size is really huge, we should probably give priority to algorithms with lower complexity.

During lecture, Professor Danny introduced us to several different sorting methods. To make them easy to understand, he sorted playing cards using the different methods, which did help a lot. We compared four sorting algorithms in class and several others during the lab. They are:

Selection Sort:
From Wikipedia: "The algorithm divides the input list into two parts: the sublist of items already sorted, which is built up from left to right at the front (left) of the list, and the sublist of items remaining to be sorted that occupy the rest of the list. Initially, the sorted sublist is empty and the unsorted sublist is the entire input list. The algorithm proceeds by finding the smallest (or largest, depending on sorting order) element in the unsorted sublist, exchanging it with the leftmost unsorted element (putting it in sorted order), and moving the sublist boundaries one element to the right."

worst case: O(n**2)
average case: O(n**2)
best case: O(n**2)
The n ** 2 comes from the fact that to fill each position we must compare all the remaining items to find the smallest, i.e. total comparisons: n-1 + n-2 + ...... + 1
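The idea above can be sketched in Python; the function name and details are my own illustration, not the posted course code:

```python
def selection_sort(L):
    """Sort list L in place by repeatedly selecting the smallest
    remaining item and swapping it to the front of the unsorted part."""
    for i in range(len(L) - 1):
        # find the index of the smallest item in the unsorted sublist L[i:]
        smallest = i
        for j in range(i + 1, len(L)):
            if L[j] < L[smallest]:
                smallest = j
        # swap it into position i, extending the sorted sublist by one
        L[i], L[smallest] = L[smallest], L[i]
```

The inner loop runs n-1, then n-2, ... then 1 times, which is exactly the comparison count above.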

Insertion Sort:
From Wikipedia: "Sorting is typically done in-place, by iterating up the array, growing the sorted list behind it. At each array-position, it checks the value there against the largest value in the sorted list (which happens to be next to it, in the previous array-position checked). If larger, it leaves the element in place and moves to the next. If smaller, it finds the correct position within the sorted list, shifts all the larger values up to make a space, and inserts into that correct position."

worst case: O(n**2)
average case: O(n**2)
best case: O(n)
The best case is an already (or nearly) sorted list: each item needs only one comparison with the end of the sorted sublist, so about n-1 comparisons in total.
The worst case is a reverse-sorted list: inserting the i-th item takes up to i comparisons and shifts, so the total is 1 + 2 + .... + n-1 => O(n**2)
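A sketch of insertion sort in Python (again my own names, not the posted code):

```python
def insertion_sort(L):
    """Sort list L in place by growing a sorted sublist at the front
    and inserting each new item into its correct position."""
    for i in range(1, len(L)):
        value = L[i]
        j = i
        # shift larger items in the sorted sublist one slot to the right
        while j > 0 and L[j - 1] > value:
            L[j] = L[j - 1]
            j -= 1
        # drop the item into the gap that opened up
        L[j] = value
```

On a sorted list the while loop never runs, which is where the O(n) best case comes from.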

Quick Sort:
From Wikipedia:
"The steps are:
  1. Pick an element, called a pivot, from the list.
  2. Reorder the list so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
  3. Recursively apply the above steps to the sub-list of elements with smaller values and separately to the sub-list of elements with greater values.
The base case of the recursion is lists of size zero or one, which never need to be sorted."

worst case: O(n**2)
average case: O(n lg n)
best case: O(n lg n)
The worst case comes from an already sorted list (when the first item is used as the pivot), because then the partition puts everything on one side and does no useful splitting.
Otherwise, partitioning gives us about lg n levels, and at each level we need about n comparisons, so it is O(n lg n).
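A minimal quick sort sketch, assuming the first item is the pivot (which is exactly what makes a sorted list the worst case):

```python
def quick_sort(L):
    """Return a new sorted list using the partition-and-recurse idea.
    The first item is taken as the pivot; this choice is my own
    simplification, not necessarily the posted course code."""
    if len(L) < 2:      # base case: 0 or 1 items never need sorting
        return L
    pivot = L[0]
    smaller = [x for x in L[1:] if x < pivot]
    larger = [x for x in L[1:] if x >= pivot]
    # the pivot lands in its final position between the two partitions
    return quick_sort(smaller) + [pivot] + quick_sort(larger)
```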

Merge Sort:
From Wikipedia:
"Conceptually, a merge sort works as follows:
  1. Divide the unsorted list into n sublists, each containing 1 element (a list of 1 element is considered sorted).
  2. Repeatedly merge sublists to produce new sorted sublists until there is only 1 sublist remaining. This will be the sorted list."

worst case: O(n lg n)
average case: O(n lg n)
best case: O(n lg n)
The lg n comes from repeatedly halving the list, and the n is because at each level of merging, all items need to be compared.
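The two steps above, halving and merging, can be sketched like this (my own names, not the posted code):

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list by repeatedly
    comparing the front items of each."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # one of the slices is empty; the other is already sorted
    return result + left[i:] + right[j:]


def merge_sort(L):
    """Return a new sorted list by halving L and merging the halves."""
    if len(L) < 2:      # base case: 0 or 1 items are already sorted
        return L
    mid = len(L) // 2
    return merge(merge_sort(L[:mid]), merge_sort(L[mid:]))
```

The halving always splits evenly, which is why merge sort stays O(n lg n) even in the worst case, unlike quick sort.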

Count Sort:
def count_sort(L):
    """silly functional sort.  See radix sort for something serious"""
    count_list = [0, ] * len(L)
    for i in L:
        count_list[i] += 1

    L1 = [] # blank slate!
    for i in range(len(count_list)):
        for j in range(count_list[i]):
            L1.append(i)
    return L1

For all cases it is O(n) (strictly speaking O(n + k), where k is the length of count_list), which means the running time is proportional to the length of the list.
However, this function only works for lists of non-negative integers in which no number is greater than or equal to the length of the list.
I think a small change to the code will make it work for more cases. Here is my code:
def count_sort(L):
    """silly functional sort.  See radix sort for something serious"""
    count_list = [0, ] * (max(L)+1)
    for i in L:
        count_list[i] += 1

    L1 = [] # blank slate!
    for i in range(len(count_list)):
        for j in range(count_list[i]):
            L1.append(i)
    return L1

Bubble Sort:
From Wikipedia
"Starting from the beginning of the list, compare every adjacent pair, swap their position if they are not in the right order (the latter one is smaller than the former one). After each iteration, one less element (the last one) is needed to be compared until there are no more elements left to be compared."

best case: O(n**2)
worst case: O(n**2)
average case: O(n**2)
Actually, in this naive version there is no difference between best case and worst case, because without finishing all the passes the computer never knows whether the list is sorted. In each pass, the computer needs to do comparisons and potential swaps, so it is proportional to n**2 in all cases. (A version that stops early when a pass makes no swaps would have a best case of O(n).)
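A sketch of the naive version described above (my own function name):

```python
def bubble_sort(L):
    """Sort L in place by repeatedly swapping adjacent out-of-order
    pairs; after each full pass, one more item at the end is in its
    final place, so the passes get one element shorter each time."""
    for end in range(len(L) - 1, 0, -1):
        for i in range(end):
            if L[i] > L[i + 1]:
                L[i], L[i + 1] = L[i + 1], L[i]
```

Note that this version does every pass even if the list is already sorted, which is why its best case is still O(n**2).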



The posted code for every algorithm is very helpful. Studying the code helped me further understand how the algorithms work. Also, some of the code contains recursion, which helped me review recursion a little bit.

In real life, when choosing an algorithm, I think we also need to consider memory space. I guess we will study this later.



Saturday, March 22, 2014

Week 8, 9, 10

Main topics:

LinkedLists(W8):
    LinkedList is a brand new topic for me. It is a data structure. Every linked list node has an item and a link; the link can either be None or another linked list. In order to modify the linked list (by modifying, I mean adding or deleting items), I need to first find the right place, and then modify the whole list by changing self.item and self.link simultaneously.
    The most difficult part is the first step. I need to think through the whole process and find the right spot to change. The thinking process is very time consuming. Once I am clear about the process, coding is not that difficult.
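The "find the right place, then change the link" idea can be sketched like this; the class and function names are my own illustration, not the course code:

```python
class LinkedList:
    """A linked list node: an item plus a link to the rest of the
    list, which is either another LinkedList or None."""
    def __init__(self, item, link=None):
        self.item = item
        self.link = link


def prepend(lst, item):
    """Return a new LinkedList with item added at the front."""
    return LinkedList(item, lst)


def delete_first(lst, item):
    """Delete the first occurrence of item and return the new list:
    recurse until the right spot is found, then splice the link past
    the deleted node."""
    if lst is None:
        return None
    if lst.item == item:
        return lst.link
    lst.link = delete_first(lst.link, item)
    return lst
```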

Binary Search Tree(W9):
    Binary Search Tree is an extension of LinkedList. The rough idea is: every BST node has an item and two children (either or both of which can be None). If not None, the items in the left subtree are less than the item and the items in the right subtree are bigger than the item.
    In order to modify the BST (by modifying, I mean inserting a value in its right place or deleting a value), I need to do exactly what I do in a linked list: first find the right place to modify. The more challenging part is deletion, where I also need to restructure the tree a little. When an inner node is deleted, I need to find the biggest item in its left subtree to become the new inner node and connect the other nodes properly.
    When we search for a value in a binary search tree, it saves us a lot of time, because at each recursive step we only need to consider half of the tree. So a Binary Search Tree does an amazing job of reducing search time.
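Insertion and search can be sketched like this (my own class and function names, and only a sketch; the trickier deletion case is left out):

```python
class BSTNode:
    """A binary search tree node; either child may be None."""
    def __init__(self, item, left=None, right=None):
        self.item = item
        self.left = left
        self.right = right


def insert(node, value):
    """Insert value and return the (possibly new) subtree root,
    keeping smaller items on the left and bigger ones on the right."""
    if node is None:
        return BSTNode(value)
    if value < node.item:
        node.left = insert(node.left, value)
    else:
        node.right = insert(node.right, value)
    return node


def search(node, value):
    """Return whether value is in the tree. Each recursive step
    discards half of the remaining tree."""
    if node is None:
        return False
    if value == node.item:
        return True
    if value < node.item:
        return search(node.left, value)
    return search(node.right, value)
```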

Some Sorting Methods(W10):
    Reviewed some sorting methods learned in CSC108, including insertion sort and selection sort. The idea of Big O is new to me. Insertion sort and selection sort are both quadratic methods. When the size of the data is small, they perform well. However, as the data size increases, the time they take grows quadratically, so they don't perform very well on big inputs.

Assignment & Exercise:
Assignment 2I (W8):
    Assignment 2I is to design a class. The key part is to create subtrees and supertrees to avoid repeated calculations.
Exercise 3 (W9):
    1-We are asked to produce a tree given its preorder and inorder traversals. The fun part is using recursion properly.
    2-We are asked to return the sum of the keys from the root to the deepest leaf in t.
Assignment 2II (W10):
    Assignment 2II is harder than I thought. In fact, every function needs intellectual work. I needed to first write down the base cases and then think about the recursive step. Prof Danny's explanation was really helpful. After knowing the rough idea of the method and the base cases, writing the recursive code was easy. PS: at the beginning we spent more time debugging than writing code. It took us 2 days to fix the bugs.

Sunday, March 2, 2014

Week 7: Recursions

Recursion is a brand new topic for me.

In CSC108, we tried to use for loops and while loops. Some recursions can be substituted by loops; however, many are much harder to express that way.

Recursion means defining a function by calling itself inside the function body. We can do this because the same process repeats for every item. From my experience so far, I think recursion is extremely useful when dealing with nested lists, nested trees, and other nested ADTs.

We first learned about recursion on nested lists. For the nested_depth function, without recursion I cannot think of a proper way to apply the same function from the innermost list to the outermost list. I tried to write a loop for rec_max; although it can be done, I needed to write more lines of code and create more variables for storing things properly, which makes it not as neat as recursion. In contrast, recursive functions are tidy. In terms of what we have learned, the code will usually be one line, which sometimes contains a list comprehension.
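A one-line-body sketch of the nested_depth idea (my own version, not necessarily the lecture code), using the sum([]) / max([]) trick mentioned below:

```python
def nested_depth(L):
    """Return the maximum nesting depth of L: 0 for a non-list,
    1 for a flat list, and one more for each level of nesting.
    Appending [0] keeps max() from failing on an empty list."""
    if not isinstance(L, list):
        return 0
    return 1 + max([nested_depth(x) for x in L] + [0])
```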

Also, recursion is a foundation for some ADTs such as representations of trees, linked implementations of trees, and linked lists. Recursion is needed when we define a method of such a class (even the __init__ method can use recursion). In this case, recursion is extremely useful, since we have no idea whether the class will be based on lists or tuples or sets; it may just be some abstract data type. We cannot use indexes to refer to the item we want, which makes loops almost impossible. The examples in the linked implementations of trees were extremely useful for me. They contain examples of using and not using list comprehensions, and of adding 1 inside or outside the list comprehension. The examples with 'max' and 'sum' deepened my understanding (max([]) -> Error, sum([]) -> 0) and taught me some tricks to avoid the Error.


Also, recursion is really useful when we need to find the solution of a recursive math problem like permutations and the Tower of Hanoi.

In terms of permutations, doing it without recursion, by hand for example, would take me a lot of time. I would probably not be able to avoid repeated items, and some permutations might be missed. Recursion gives me a clear way to make sure that all the permutations are included and none of them repeat.
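The recursive idea can be sketched like this (my own function, not the course solution): each item takes a turn at the front, and the rest of the list is a smaller instance of the same problem.

```python
def permutations(L):
    """Return a list of all permutations of list L."""
    if len(L) <= 1:           # base case: one permutation (possibly empty)
        return [L[:]]
    result = []
    for i in range(len(L)):
        rest = L[:i] + L[i + 1:]      # everything except L[i]
        for p in permutations(rest):
            result.append([L[i]] + p)
    return result
```

Because each item leads a distinct branch and the recursion covers every arrangement of the rest, nothing is missed and (for distinct items) nothing repeats.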

As for the Tower of Hanoi, recursion did a fantastic job. It can help us decide which cheese goes to which stool so that every move follows the rules. What's more, it can tell us the most efficient way to move the cheeses, i.e. from one stool to another in the fewest steps, and we know exactly what each move is. What makes this fantastic is that we don't need to think through every step ourselves; a single recursive relationship solves all of that for us.
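The classic three-stool recurrence can be sketched like this; it is only a sketch of the idea, not the four-stool assignment code:

```python
def hanoi_moves(n, source, target, spare):
    """Return the list of (from, to) moves that transfers n cheeses
    from source to target using one spare stool: move n-1 out of the
    way, move the biggest cheese, then move the n-1 back on top."""
    if n == 0:
        return []
    return (hanoi_moves(n - 1, source, spare, target)
            + [(source, target)]
            + hanoi_moves(n - 1, spare, target, source))
```

For n cheeses this produces 2**n - 1 moves, which is the fewest possible with three stools.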

In terms of how to use recursion, I think we need to know two things: the base case and the recursive relationship.

When I first learned recursion, I found it difficult to know what the base case is. I was not used to the thinking process at the beginning; I tried to picture the whole process and ended up in a mess. Now, I first find the recursive relationship, and then think about the base case. The base case will usually be the "empty case", and I call the recursion for the other cases. As for tracing recursions, I go from the innermost call to the outermost, and reuse the results I have already computed instead of tracing them again.

In conclusion, recursion is a really useful technique for recursive problems, and I believe it is a foundation for what we will learn later in this class. I will go through the examples in the linked implementations of trees several more times to make sure I fully understand them.

Sunday, February 16, 2014

Week 6: ~Trees~

At the beginning of this week, I finally finished Assignment 1~ :) It was very challenging to use recursion to find i, but I finally found a way to do it: record the i and move at the same time.

The lab was about test cases. I forgot how to write a standard test case, so my partner and I spent some time Googling them. After reading several examples, we figured them out. After the lab, I did some review of test cases.

Lectures were about recursion; we learned more about it this week. The structure of trees is a brand-new topic. We learnt about pre-order traversal, in-order traversal, post-order traversal, and how to use recursion and trees together. Reading and understanding code seems easy to me, but I need more practice writing recursions on my own.
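The three traversals only differ in when the root is visited; here is a sketch with my own class and function names, not the lecture code:

```python
class Tree:
    """A simple binary tree node for illustrating traversals."""
    def __init__(self, item, left=None, right=None):
        self.item = item
        self.left = left
        self.right = right


def preorder(t):
    """Root first, then left subtree, then right subtree."""
    return [] if t is None else [t.item] + preorder(t.left) + preorder(t.right)


def inorder(t):
    """Left subtree, then root, then right subtree."""
    return [] if t is None else inorder(t.left) + [t.item] + inorder(t.right)


def postorder(t):
    """Left subtree, then right subtree, then root last."""
    return [] if t is None else postorder(t.left) + postorder(t.right) + [t.item]
```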

Also, test 1 is coming soon. I want to spend some time reviewing the material I have already learnt. So far, I enjoy this course very much :)

Saturday, February 8, 2014

Week 5: Tower of Hanoi and some recursions

This week, I put a lot of effort into Assignment 1. The Tower of Hanoi is an interesting problem, and I am surprised that the computer can help us solve it in such a clever way (provided you tell the computer to do the right thing).

Our job is to program the game with four stools. The most challenging part is finding the fewest moves, which involves recursion. In Monday's lecture, we got some idea of the three-stool recursion. In Wednesday's lecture, we learnt something about global and local variables: if a function is inside another function, the values it produces will probably be local unless "global" is used. I guess it might be useful in Assignment 1.

The labs were about tracing recursions. I think tracing is easier than actually writing the code, because, to me, tracing is the first step of writing code. Before writing code, I prefer to work through some examples, which is exactly what tracing is.

Although the course is getting harder, I find it really interesting.

Friday, January 31, 2014

Week 4: Exercise 2 and Recursions~

I started this week's CSC148 by doing Exercise 2. Compared to Exercise 1, this exercise needs more thinking. Without some understanding of exceptions, I don't think I could have done it. Actually, this exercise was a good opportunity to review and think further about exceptions. I struggled with what would happen if the input is a list. I think I have done both the old version and the updated version correctly. The old version is trickier; I spent a relatively long time thinking of a solution for it.

Lectures were about recursion. I found that recursion is a strange thing to me: when I look at code someone has already finished (like the examples from lecture), I can understand and analyze it easily; however, when I try to write the code by myself, I struggle a lot. Probably I need more practice with recursion. I liked the "turtle" example, because we can actually see what happens in the pictures.

During the lab, my partner and I reviewed testing modules. We spent quite a lot of time on testing, but I think it is worthwhile, because writing test functions will save us a lot of time: we don't need to test every single case again and again.

Assignment 1 is coming; I am looking forward to it.

Saturday, January 25, 2014

Week 3: Object-Oriented Programming

From the first week till now, the main focus of this course has been Object-Oriented Programming.

This week, we reviewed what we have learnt about "class", and then we further studied subclasses and superclasses. The reason to derive a new class from a base class instead of creating a new class directly is inheritance: by inheriting, we can avoid writing the same code more than once. For a method in the subclass that has the same name as one in the superclass, we can either use the same method as the base class or override and alter it. The readings on inheritance supplement the lectures and really helped me to further understand the point of using inheritance.

We also learnt about exceptions. This is the most confusing part for me. Prof Heap encouraged us to try the examples ourselves in order to better understand the material. After lecture, I played with the exception example and tried all the possibilities listed in it. I think I got some sense of it, but I still cannot explicitly explain why each result happens. So I decided to go through the reading on exceptions. The reading provides more examples, but I am still a little confused. I tried Python's help() on except; there is a very long and detailed explanation of exceptions, and it is really helpful. We often use try & except together. When Python finds a matching except clause (the match may come via a superclass of the raised exception; this is the tricky point), the code inside that except block is evaluated; otherwise, Python continues to the next except clause until the end.
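The superclass-matching behaviour can be shown with a tiny example of my own (the class name Special is made up for illustration):

```python
class Special(Exception):
    """A subclass of Exception, to show matching by superclass."""
    pass


def classify(exc):
    """Raise exc and report which except clause catches it; the
    first matching clause wins, and a superclass counts as a match."""
    try:
        raise exc
    except Special:
        return 'caught by Special'
    except Exception:
        return 'caught by Exception'
```

A Special() is caught by the first clause, while a ValueError() is not a Special, so it falls through to the Exception clause. If the clauses were written in the opposite order, the Exception clause would catch everything, because Special is a subclass of Exception.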

In conclusion, this week's material is interesting but a little challenging for me. I am going to start exercise 2 soon, hopefully I will finish it this weekend :)