Jeanette Wing (Vice President of Microsoft Research, and previously President’s Processor of Computer Science at Carnegie Mellon University, Pittsburgh) wrote a short but highly influential paper outlining the importance of computational thinking. That paper forms the basis of this unit within the IB course.
You should read it… https://www.cs.cmu.edu/~15110-s13/Wing06-ct.pdf
There are competing thoughts as to how best categorize computational thinking processes. The IB uses these six:
Others, such as Google and Code.org use the following alternative labels:
They all achieve the same point – a set of thinking skills that guides you through solving a problem.
Example: Cleaning the entire house/apartment, against cleaning the kitchen, bathroom, living room, bedrooms
Finally, Look for familiar things. Do not reinvent the wheel. If a solution exists, use it. If you have solved the same or a similar problem before, just repeat the successful solution. We usually do not consciously think, “I have seen this before, and I know what to do” - we just do it. Humans are good at recognizing similar situations. We do not have to learn how to go to the store and buy milk, then to buy eggs, then to buy candy. We know that going to the store is always the same and only what we buy is different.
Thinking logically is all about making decisions, and formalising the conditions that will affect those decisions. There are generally three steps:
Decisions may be multi faceted. Compare these two:
The question we ask ourselves in the “if” statement is known as the condition. We can use “boolean logic” to create multiple conditions for a single “if”. Both of these do the same thing:
if (time is 6:00am) and (day is weekday) then
wake up for school
end if
if (time is 6:00am) and ((day is Monday) or (day is Tuesday) or
(day is Wednesday) or (day is Thursday) or (day is Friday)) then
wake up for school
end if
Logical decisions can also be applied to things of a repetitive nature
while (hungry) then
raid pantry
eat
end
for (each episode of Game Of Thrones)
watch the episode
end
On 1st June 2009, Air France flight 447 left Rio de Janeiro heading to Paris. It was a routine international flight. In the early hours of the morning, over the Atlantic Ocean, contact was lost, and the aeroplane vanished.
On investigation, the plane showed signs of a high-speed impact with water as the nose cone was flattened. This ruled out a bomb or structural break-up. It was determined that the plane crashed into the water due to pilot error.
The plane flew through a thunderstorm. Other aeroplanes had diverted that night, as is standard practice in bad weather. The pitot tubes (speed sensors) had frozen over as a result. This caused the autopilot to switch off and incorrect readings to be sent to the cockpit. This is expected behaviour, and pilots are trained to recognise this. Believing that the plane was losing altitude, the pilot pulled back on the stick to raise the nose, in an attempt to gain height. The instruments continued to show the plane falling. If an aircraft’s nose is pointed up too far, it loses speed, causing the engines to stall. The correct action is to point the nose down, gaining speed, before levelling off.
With the aid of a flowchart, show how logical thinking could have avoided this accident.
You will be divided into groups. Each group will receive 8 film canisters and a balance scale. Your task is to sort the canisters from lightest to heaviest.
Use your procedural and logical thinking skills to devise the most efficient instructions for sorting the canisters. Count the number of “comparisons” your procedure makes. The winning team are the ones with the least comparisons (most efficient algorithm)
Thinking ahead is all about pre-planning and attempting to anticipate future needs.
What are the pre-conditions to solving the problem?
What are the post-conditions of the problem?
Anticipate exceptions to the rule
Examples of thinking ahead in daily life:
Can you solve the puzzle in Die Hard?
https://www.youtube.com/watch?v=6cAbgAaEOVE
You are given two eggs, and access to a 100-storey building. Both eggs are identical. The aim is to find out the highest floor from which an egg will not break when dropped out of a window from that floor. If an egg is dropped and does not break, it is undamaged and can be dropped again. However, once an egg is broken, that’s it for that egg.
If an egg breaks when dropped from floor n, then it would also have broken from any floor above that. If an egg survives a fall, then it will survive any fall shorter than that.
The question is: What strategy should you adopt to minimize the number egg drops it takes to find the solution?. (And what is the worst case for the number of drops it will take?)
** Hint **
Whilst it’s not strictly part of the puzzle, let’s first imagine what we’d do if we had only one egg.
Once this egg is broken, that’s it, no more egg. So, we really have no other choice but to start at floor 1. If it survives, great, we go up to floor 2 and try again, then floor 3 … all the way up the building; one floor at a time. Eventually the egg will break* and we’ll have a solution. For example, if it breaks on floor 57, we know that the highest floor that an egg can withstand a drop from is floor 56.
There’s no other one egg solution. If we’d been feeling lucky we could have gone up the floors in two’s but imagine if the egg broke on floor 16; we have no way of knowing if it would have also broken on floor 15!
** Hint part 2 **
At the other extreme, what if we had an infinite number of eggs? (Or at least as many eggs as we need). What would our strategy be here? In this case we’d use one of a programmer’s favorite tools, the binary tree.
First we’d go to floor 50 and drop an egg. It either breaks, or it does not. The outcome of this drop instantly cuts our problem in half. If it breaks, we know the solution lives in the bottom half of the building (floor 1 – floor 49). If it survives, we know the solution is in the top half of the building (floor 51 – 100). On each drop, we keep dividing the problem in half and half again until we get to our solution.
The mathematicians in the audience will quickly see that the number of drops required for this solution is log2 n, where n is the number of floors of the building. (This is like asking how many powers of two there are). ie: log2 100 = 6.644, or 7.
** Hint part 3 **
It does not take much imagination to see why a binary search solution will not work (optimally) for two eggs. Let’s imagine we did try a binary search and dropped our first egg from floor 50. If it broke, we’d be instantly reduced to a one egg problem.
What happens if we started off with our first egg going up by floors ten at a time? We can start dropping our egg from floor 10; if it survives we try floor 20, then floor 30 … we carry on until the first egg breaks. Once we’ve broken our first egg we know that the solution must lay somewhere in the nine floors just below, so we back off nine floors and step through these floors one at a time until we find a solution.
The question really comes down to: what is the optimal number of floors to skip with the first egg?
** Hint part 4 **
What we need is a solution that minimizes our maximum regret. We need is a strategy that tries to make solutions to all possible answers the same depth (same number of drops). The way to reduce the worst case is to attempt to make all cases take the same number of drops.
You have 10 boxes of ball bearings (each ball weighing exactly 10 gm) with one box with defective ball bearings (each one of the defects weigh 9 gm). You are given an electronic weighing machine and only one chance at it. How will find out which box has the defective ball bearings?
Concurrency = dealing with multiple things happening at the same time.
Tasks are broken down into subtasks that are then assigned to separate processors to perform simultaneously, instead of sequentially as they would have to be carried out by a single processor. (Oracle)
A non-computing example is the GANTT chart where multiple processes occur simultaneously. For example project managing the construction of a new house.
GPU’s are a popular and powerful way to do multithreaded programming on computers.
Syllabus note: Students will not be expected to construct an algorithm related to concurrent processing (but you may be expected to be able to interpret/understand/recognise one presented to you)
How long (comparisons) does it take to sort a deck of cards with:
Being able to create a meaningful model, or way to represent, a real world “thing” in a way that contains everything that is relevant, and nothing that isn’t.
Video: Robotics Academy (2016) Abstraction - Computational Thinking (2:28)
Real world examples:
When you take a real-world situation and are writing a program or algorithm for it, you are creating an abstraction: a way to represent that situation within the computer. As such you will make decisions about how that abstraction should be created such as what variables you need, what you will call them, what data type they will be, and how your algorithm will behave in response.
To succeed at thinking abstractly, you need to be able to take a problem and identify the parts that are relevant to your solution.
Jake & Jill’s weekly food shop
Jake and Jill are quite fed up of how long they spend in the supermarket each week doing their weekly food shop. They decide what they want when they are actually walking around the supermarket and they often have to go back multiple times in the week as they run out of items. This method of shopping is also resulting in a very expensive total weekly shopping bill!
How could they use the principles of computational thinking to make their weekly shopping experience as efficient as possible. There overall aims are to:
Taxi driver
A taxi driver uses his experience, a GPS navigation system and radio tuned to traffic information to work out how to get passengers from A to B.
In what ways is the taxi driver able to:
Program design is all about solving problems.
If you can’t solve and articulate the problem by hand, you will not be able to solve it with code!
There are three key strategies this course would like you to be familiar with for articulating and testing algorithms. They are:
There are a lot of different computer programming languages available, each serving different needs. Algorithms, however, are universal. A programmer who uses one language, should be able to communicate how to create an algorithm to another programmer who uses a completely different language without knowing anything about it!
For that reason, most of the time an algorithm is being written it won’t be in a language-specific format, but one of two generic forms: flow charts or pseudo-code.
What is pseudo code?
Notation that is intended to be language-neutral, so a programmer can understand what is required regardless of the languages they know.
Oxford dictionaries define “pseudo” as being Not genuine; spurious or sham.. In that way, pseudo-code is not a genuine programming language; It is a documentation tool. With that in mind there is no 100% “correct” or “incorrect” way of writing it. If you search “how to write pseudo code”, you will find a variety of different answers and syntax articulated. Don’t stress too much about memorising a “correct” way of writing it. If what you write achieves the purpose of providing a language neutral way of documenting an algorithm so a programmer can understand it regardless of the languages they know, then you have written good pseudo code.
IB expectations
You need to be able to create as well as interpret/analyse pseudo code.
Where answers are to be written in pseudocode, the examiners will be looking for clear algorithmic thinking to be demonstrated. In examinations, this type of question will be written in the approved notation, so a familiarity with it is expected. It is accepted that under exam conditions candidates may, in their solutions, use pseudocode similar to a programming language with which they are familiar. This is acceptable. The markscheme will be written using the approved notation. Provided the examiners can see the logic in the candidate’s response, regardless of language, it will be credited. No marks will be withheld for syntax errors … for answers to be written in pseudo code
IB syntax
Given the statements made above about there not being a “correct” way to write it, there is a set syntax that the IB will use when they write pseudo code questions for you in the exam. As per the expectations comments above, you are not obligied to follow their syntax in your responses, and you will not lose credit for doing so, but you should be familiar enough with their syntax to be able to properly interpret the questions they give you.
These methods, in their pseudocode format, may be used without explanation or clarification in examination questions. Teachers should ensure that candidates will be able to interpret these methods when presented as part of an examination question.
The following comes from the IB documents
IB exams will not require you to create your own flow charts but they will present flow charts to you for analysis and interpretation.
Check the previous pages for the “official” IB method of presenting flow chart questions to you.
A trace table is a technique used to test algorithms, in order to make sure that no logical errors occur whilst the algorithm is being processed. The table usually takes the form of a multi-column, multi-row table; With each column showing a variable, and each row showing each number input into the algorithm and the subsequent values of the variables (Wikipedia: Trace table).
Trace tables are a useful way of checking the algorithm is going to behave the way you want it to. You should have columns for all variables, and for all comparison checks.
You are expected to be able to analyse and create trace tables for different algorithms. You could, for instance, be asked to:
Question 1
Write pseudo code that will sum all the even numbers input from a user. Stop after 10 numbers have been input.
Question 2
Write pseudo code that reads in any three numbers and outputs them into sorted order.
Question 3
Write pseudo code that will perform the following:
Question 4
Write pseudo code that will calculate a running sum. A user will enter numbers that will be added to the sum and when a negative number is encountered, stop adding numbers and write out the final result.
Question 5: Cricket scores
SCORES is a collection that stores the runs scored by a cricketer in each match he played.
In your answer, make use of the collections functions of isEmpty(), hasNext(), getNext(), resetNext() and addItem()
Write pseudocode for the following:
(sourced from Rajesh Kumar via IB teacher forums)
Question 6: Rainfall
The rainfall of each day during a monsoon period is stored in a collection RAINFALL. Write pseudocode for the following:
(sourced from Rajesh Kumar via IB teacher forums)
Question 7: Hailstone problem
The Hailstone Series is generated using the following high level algorithm:
This series will eventually reach the repeating ”ground” state: 4, 2, 1, 4, 2, 1
Here is the sequence generated for an initial value of 26:
Task 1: Convert the high-level algorithm into pseudo-code subject to the following:
Task 2: Produce a trace table for your Hailstone Series algorithm, given an input of 17.
Question 8: Fizz buzz
This is a “famous” programming job interview question.
Example output: 1, 2, Fizz, 4, Buzz, Fizz, 7, 8, Fizz, Buzz, 11, Fizz, …
Task: Create the pseudo code for a Fizz Buzz generater, and use a trace table to for values up to 15.
Question 9: Secret number guesser
Create pseudo code for an algorithm where the computer picks a “secret” number at random from 1 to 100. The user is then repeatedly prompted to guess the number, the computer responds with “too low”, “too high”, or “correct!”. The computer should also keep count of the number of guesses and let the user know how many guesses it required at the end.
Swap with your neighbour to test each others algorithms with trace tables. Have they documented it correctly?
Assume availability of a random number generator function to complete this exercise. The function can be Math.random( maximum ) where a number will be generated between 0 and the maximum.
Task:
Question 10: An IB question
The following exercise comes from the May 2015 IB exam.
Trace the following algorithmic fragment for N = 6.
Show all working in a trace table.
SUM = 0
loop COUNT from 1 to (N div 2)
if N mod COUNT = 0 then
SUM = SUM + COUNT
end if
end loop
if SUM = N then
output "perfect"
else
output "not perfect"
end if
More questions
The following past paper questions include those that require use arrays and collections. Make sure you are confident with those for your exams.
There are a number of algorithms the IB course will assume you know by memory. These are the “standard algorithms”.
Any algorithm not defined as a “standard algorithm” can be considered a “novel algorithm” and will be either presented to you in the exam or is an algorithm you would be expected to devise.
Search through a set of data sequentially until you find the item you are looking for.
function sequential( haystack, needle ) {
for (var i=0; i<haystack.length; i++) {
if (needle == haystack[i]) {
return(i);
}
}
return(-1);
}
One interesting thing to note with the sequential search is it does not require your data to be sorted ahead of the search. This could save a lot of processing time. If, however, your data is already sorted, you can include within your sequential search a test to see if we have already gone past the point at which the record we are looking for would exist. In that case, we can abort the remainder of the search and return a “not found” result.
A binary search divides a range of values into halves, and continues to narrow down the field of search until the unknown value is found. It is the classic example of a “divide and conquer” algorithm. Your data must be pre-sorted!
function binary( haystack, needle ) {
var max = haystack.length-1;
var min = 0;
while (max >= min) {
var mid = Math.floor((min+max)/2);
if (haystack[mid] == needle) {
return mid;
}
if (haystack[mid] > needle) {
max = mid - 1;
}
if (haystack[mid] < needle) {
min = mid + 1;
}
}
return (-1);
}
Note: This is known as an iterative binary search. There is also a recursive binary search which we will look at in a later unit. Just be aware if you have to google binary search that there are two types.
Question 1: Tracing binary search
Create a trace table on the binary search algorithm above with the following values:
Compare that on a trace table for a sequential search algorithm.
It should be fairly simple to code an implementation of this to test your algorithms - have a go.
Question 2: Simple spell checker
If user inputs a sentence, separate the individual words, strip out any punctuation, check each word against your dictionary list. Print each word out with PASS or FAIL against it based on if it was found in your dictionary.
Use the 7 step process to devise an algorithm for this and have a go at implementing it if you wish. Will you use the sequential or binary search? Why?
Bubble sort: The highest value “bubbles” to the top each round.
function bubbleSort( data ) {
var done = false;
while (!done) {
done = true;
for (var i=1; i<data.length; i++) {
if (data[i] < data[i-1]) {
done = false;
var temp = data[i-1];
data[i-1] = data[i];
data[i] = temp;
}
}
}
return(data);
}
The selection sort is just as inefficient as the bubble search, so why bother learning both? The big advantage for the selection sort is it optimised for when writing data is very expensive (slow) when compared to reading, eg. writing to flash memory or EEPROM. No other sorting algorithm has less data movement! This is important to realise as you can be asked questions justifying the use of one algorithm over another.
function selectionSort( data ) {
for (var i=0; i<data.length; i++) {
var indexOfLowest = i;
for (var j=i; j<data.length; j++) {
if (data[j] < data[indexOfLowest]) {
indexOfLowest = j;
}
}
var temp = data[i];
data[i] = data[indexOfLowest];
data[indexOfLowest] = temp;
}
return(data);
}
To see an animation of either sort in action look at https://visualgo.net/bn/sorting
Question 1
Go to https://www.random.org/integer-sets/ and produce 1 set of 50 random integers. Be sure to turn off the tick box that sorts the results for you, you want unsorted data.
Implement both sorting algorithms to process your data set. Create two programs that will load the array, sort it, and then output the list of names to the screen. Your two programs is to create one version that implements a bubble sort, and one version that implements a selection sort.
Question 2
Once you are successfully sorting numbers, how about sorting strings such as a list of names?
Question 3
Sort a set of dates given in dd/mm/yyyy format into their correct calendar order so the date which occurs first, appears first in the list.
From the IB CS syllabus:
Students should understand and explain the difference in efficiency between a single loop, nested loops, a loop that ends when a condition is met or questions of similar complexity. Students should also be able to suggest changes in an algorithm that would improve efficiency, for example, using a flag to stop a search immediately when an item is found, rather than continuing the search through the entire list. Examination questions will involve specific algorithms (in pseudocode/flowcharts), and students may be expected to give an actual number (or range of numbers) of iterations that a step will execute.
As you can see from the above, it is important to understand the efficiency of different algorithms. One measure used in industry is called Big O notation.
Big-O describes the relationship of how the run time will scale with respect to certain input variables.
Some common examples of Big-O expressions:
O(1)
- constantO(log(n))
- logarithmicO(n)
- linearO(n^2)
- quadraticO(n^c)
- polynomialO(c^n)
- exponentialIf the run time will increase linearly with respect to the size of the input data, this would be O(n). An example is to loop through the values of an array.
If the run time will increase exponentially with respect to the size of the input data, this would be O(n^2). An example is to have a loop instead a loop, where both iterate through the values of an array.
A good introduction to Big O notation can be found with the following video:
There were four important rules from the video:
function something() {
doTask1() // O(a)
doTask2() // O(b)
}
// Overall result = O(a+b)
function something() {
for each item in array: // O(n)
min = MIN(item, min)
for each item in array: // O(n)
max = MAX(item, max)
}
// The overall run time will still increase linearly, so it is O(n) not O(2n).
function something() {
for each item in A:
total = total + item
for each item in B:
total = total + item
return total
}
// The overall runtime would be O(a*b)
For example, if an algorithm as a nested for-loop which would be O(n^2) and a regular for-loop O(n), it is not O(n + n^2) because the O(n) is going to be completely dominated by the O(n^2) to such a degree that it is meaningless to worry about.
Here is also a great analogy from the comments section of that video: Let’s say you’re making dinner for your family. O is the process of following a recipe, and n is the number of times you follow a recipe.
For a graphical comparison that illustrates the difference in processing load of different O() algorithms, check the bigocheatsheet.com website.
Question 1
boolean containsValue(int[] list, int val) {
for (int item : list) {
if item==val {
return true;
}
}
return false;
}
Question 2
Boolean containsDuplicates(int[] a, int[] b) {
for (int x : a) {
for (int y : b) {
if (x==y) { return true; }
}
}
return false;
}
Question 3
int fibonacci(int n) {
if (n <= 1) { return n; }
int fib = 1;
int prev = 1;
for (int i=2; i<n; i++) {
int temp = fib;
fib += prev;
prev = temp}
return fib
}
Question 4
boolean isFirstElementZero(int[] a) {
if (a[0] == 0) {
return true;
} else {
return false;
}
}
Question 5
Which is algorithmically more efficient?
You are maintaining a customer contact details database for a sales business, currently with 1+ million records. On any given day, several hundred customers will notify a change of address for the database. Additionally, on any given day, the support staff will receive several thousand calls through the switchboard, where they will have to look up the customer data.
Should your program…?
Justify your choice with reference to the O(n) of each method. [5 marks]
Question 6
A supermarket inventory system needs to be updated to keep an accurate record of the various stock on hand. The supermarket carries a range of about 10 000 different items on its shelves. The hard disk of the computer used is efficient at performing data lookups, but takes about 100 times longer to write to disk than a lookup.
With approximately 1000 transactions per day, would it be more efficient to:
The following assumes you have already been introduced to a programming language. It only outlines the theory content associated with unit 4.3 not the introduction to a programming language.
As we saw in unit 2, the CPU uses the fetch-decode-execute-store cycle.
Software programs are sets of instructions. For a CPU to execute these instructions, each one must first be translated into machine code – simple binary codes that activate parts of the CPU.
The CPU only performs a few basic functions:
A piece of software, such as a game or web browser, combines these functions to perform more complex tasks. These are known as compound operations.
A computer program is a list of instructions that enable a computer to perform a specific task.
Computer programs can be written in high and low level languages, depending on the task and the hardware being used.
When we think about computer programmers, we are probably thinking about people who write in high-level languages.
High Level Languages
High level languages are written in a form that is close to our human language, enabling to programmer to just focus on the problem being solved.
No particular knowledge of the hardware is needed as high level languages create programs that are portable and not tied to a particular computer or microchip.
These programmer friendly languages are called ‘high level’ as they are far removed from the machine code instructions understood by the computer.
Examples include: C++, Java, Pascal, Python, Visual Basic.
Advantages
Low Level Languages
Low level languages are used to write programs that relate to the specific architecture and hardware of a particular type of computer.
They are closer to the native language of a computer (binary), making them harder for programmers to understand.
Low level lanuages include Assembly Language and Machine Code.
Assembly Language
Typical assembly language opcodes include: add, subtract, load, compare, branch, store
Advantages
Machine Code
Compiler
A compiler translates a human-readable program directly into an executable, machine-readable form before the program can run.
Interpreter
An interpreter translates a human-readable program into an executable, machine-readable form, instruction by instruction. It then executes each translated instruction before moving on to the next one. A program is translated every time it is run. Python is an example of an interpreted language.
Regardless of the programming language you learn, there are core constructs that are generally available in all major, modern programming langauges. Be sure you are aware of the terminology involved.
We’re going to do sone programming practice shortly, so, before we do let’s review some useful strategies to debug our programs.
How to ask for help
Before asking anyone about a bug in your code, it’s important that you make sure you have all of the following components of an excellent question:
Source: hartleybrody (2016)
Refer to link on main course page
Computer Science Illuminated by Nell Dale & John Lewis (page numbers based on 6th edition):