Factoring Algorithm puzzle
What is the minimal number, X, of yes/no questions needed to find the smallest positive integer divisor (other than 1) of an integer between 2 and 166 (inclusive)?
We are asking for the exact answer in two cases:
- In the worst case, i.e., what is the smallest number X for which we can guarantee finding it in no more than X questions.
- On average, i.e., assuming that the number was chosen in uniform distribution from 2 to 166 and we want to minimize the expected number of questions.
This problem was IBM’s Ponder This puzzle of November 2009.
I’m going to write my solution in English, along with code (both python and java) so anyone knowing any one of the 3 languages should be able to follow along.
For both #1 and #2 there are some relevant thoughts:
the smallest integer dividers (above 1) of all integers (above 1) are prime, so the divisor we are looking for will be prime, specifically only a few prime numbers that will be relevant between 2 and 166 (inclusively):
to find out about these primes, I wrote a little code:
Python (3.1) :
prime_list = []
maxx = 166
for n in range(2,maxx+1):
isprime = True
for p in prime_list:
if n % p is 0;
isprime = False;
break;
if isprime == True:
prime_list.append(n)
print("the primes less than",maxx+1,"are:")
print(prime_list)
print("there are ", len(prime_list), "of them")
OR Java :
import java.util.ArrayList;
public class primes {
public static ArrayList<Integer> prime_list = new ArrayList<Integer>();
public static void main(String args[]){
int maxx = 166;
boolean isPrime;
for(int n=2; n <= maxx; n++){
isPrime = true;
for(int p : prime_list){
if(n % p == 0){
isPrime = false;
break;
}
}
if(isPrime){
prime_list.add(n);
}
}
System.out.println("the primes less than"+ maxx+1+ " are:");
System.out.println(prime_list);
System.out.println("there are " + prime_list.size() + " of them");
}
}
Both the Python and the Java version of the logic will output the following:
the primes less than 167 are: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163] there are 38 of them
these 38 primes are the only possible ‘smallest integer dividers’ which we are trying to determine with our questions.
Worst Case (part #1 of the Puzzle) seems like the easier problem to solve, so lets do it.
In the worst case, i.e., what is the smallest number X for which we can guarantee finding it in no more than X questions.
well there are 38 possible solutions, and we only care about worst case, so lets gather half of them and ask (Question 1) if the smallest integer divisor (>1) is in that group.
if the answer is yes, then we have narrowed it down to 19, if the answer is 19 then we have narrowed it down to the other 19, either way we are left with 19.
Now take that 19 and group roughly half of them together (10) and ask (Question 2) if the smallest integer divisor (>1) is in that group.
if the answer is yes, then we have narrowed it down to 10, if the answer is no, then we have narrowed it down to 9.
Now take that 10 (or 9) and group together about half (5) and ask (Question 3) if the smallest integer divisor (>1) is in that group.
If the answer isyes, then we have narrowed it down to 5, otherwise we have also narrowed it down to 5 or 4 (depending if we had 10 or 9).
Now take those 5 or 4 and take roughly half (2) and ask (Question 4) if the smallest integer divisor is one of the two.
If the answer is yes, then we have narrowed it down to 2, otherwise we have narrowed it down to 3 or 2 (depending if we had 5 or 4)
Now take those 3 or 2 and take the first integer, and ask (Question 5) if that’s the greatest integer divisor (>1).
If yes, then we are done, that’s the number, if not and we started with 2, we are also done (the other is the number), if we had 3 then we have to ask one more question.
Take the 2 remaining and ask (Question 6 (not always required)) if that’s the greatest integer divisor (>1).
If yes, then we are done, that’s the number, if not then the other number is it.
So this method has a worst case of 6, a best case of 5, what about the average case?
here’s the code so far in Python 3.1:
prime_list = []
maxx = 166
for n in range(2,maxx+1):
isprime = True
for p in prime_list:
if n % p is 0;
isprime = False;
break;
if isprime == True:
prime_list.append(n)
print("the primes less than",maxx+1,"are:")
print(prime_list)
print("there are ", len(prime_list), "of them")
def smallestPrimeFactor(n):
for p in prime_list:
if n/p == int(n/p):
return p
question_history=[]
for n in range(2,maxx+1):
prime_sublist = prime_list
question_count=0
while True:
question_count+=1
if smallestPrimeFactor(n) in prime_sublist[0:int(len(prime_sublist)/2)]:
prime_sublist = prime_sublist[0:int(len(prime_sublist)/2)]
else:
prime_sublist = prime_sublist[int(len(prime_sublist)/2):]
if len(prime_sublist)==1:
print( "number =", n, " smallest factor (>1) =", prime_sublist[0], " questions asked =", question_count)
question_history.append(question_count)
break
print( "worst case =", max(question_history) )
print( "best case =", min(question_history) )
print( "average case =", sum(question_history) / (maxx-1) )
OR the program so far in Java:
import java.util.ArrayList;
import java.util.List;
public class primes {
public static ArrayList<Integer> prime_list = new ArrayList<Integer>();
public static void main(String args[]){
int maxx = 166;
boolean isPrime;
for(int n=2; n <= maxx; n++){
isPrime = true;
for(int p : prime_list){
if(n % p == 0){
isPrime = false;
break;
}
}
if(isPrime){
prime_list.add(n);
}
}
System.out.println("the primes less than 167 are:");
System.out.println(prime_list);
System.out.println("there are " + prime_list.size() + " of them");
int max_questions = -1;
int min_questions = -1;
int questions_sum = 0;
for(int n=2; n <= maxx; n++){
List<Integer> prime_sublist = prime_list;
int question_count = 0;
while(true){
question_count++;
if(prime_sublist.subList(0,prime_sublist.size()/2).contains(smallestPrimeFactor(n))){
prime_sublist = prime_sublist.subList(0,prime_sublist.size()/2);
}else{
prime_sublist = prime_sublist.subList(prime_sublist.size()/2,prime_sublist.size());
}
if(prime_sublist.size() == 1){
System.out.println("number = " + n + " smallest factor (>1) = " + prime_sublist.get(0) + " questions asked = " + question_count);
if(question_count > max_questions){
max_questions = question_count;
}
if(question_count < min_questions || min_questions == -1){
min_questions = question_count;
}
questions_sum += question_count;
break;
}
}
}
System.out.println("worst case = "+max_questions);
System.out.println("best case = "+min_questions);
System.out.println("average case = "+(float) questions_sum/(maxx-1));
}
public static int smallestPrimeFactor(int n){
for(int p: prime_list){
if((n / p) == (float)n / p){
return p;
}
}
return 0;
}
}
The output for either of these programs is this:
the primes less than 167 are: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163] there are 38 of them number = 2 smallest factor (>1) = 2 questions asked = 5 number = 3 smallest factor (>1) = 3 questions asked = 5 number = 4 smallest factor (>1) = 2 questions asked = 5 number = 5 smallest factor (>1) = 5 questions asked = 5 number = 6 smallest factor (>1) = 2 questions asked = 5 number = 7 smallest factor (>1) = 7 questions asked = 5 number = 8 smallest factor (>1) = 2 questions asked = 5 number = 9 smallest factor (>1) = 3 questions asked = 5 number = 10 smallest factor (>1) = 2 questions asked = 5 number = 11 smallest factor (>1) = 11 questions asked = 5 number = 12 smallest factor (>1) = 2 questions asked = 5 number = 13 smallest factor (>1) = 13 questions asked = 5 number = 14 smallest factor (>1) = 2 questions asked = 5 number = 15 smallest factor (>1) = 3 questions asked = 5 number = 16 smallest factor (>1) = 2 questions asked = 5 number = 17 smallest factor (>1) = 17 questions asked = 5 number = 18 smallest factor (>1) = 2 questions asked = 5 number = 19 smallest factor (>1) = 19 questions asked = 6 number = 20 smallest factor (>1) = 2 questions asked = 5 number = 21 smallest factor (>1) = 3 questions asked = 5 number = 22 smallest factor (>1) = 2 questions asked = 5 number = 23 smallest factor (>1) = 23 questions asked = 6 number = 24 smallest factor (>1) = 2 questions asked = 5 number = 25 smallest factor (>1) = 5 questions asked = 5 number = 26 smallest factor (>1) = 2 questions asked = 5 number = 27 smallest factor (>1) = 3 questions asked = 5 number = 28 smallest factor (>1) = 2 questions asked = 5 number = 29 smallest factor (>1) = 29 questions asked = 5 number = 30 smallest factor (>1) = 2 questions asked = 5 number = 31 smallest factor (>1) = 31 questions asked = 5 number = 32 smallest factor (>1) = 2 questions asked = 5 number = 33 smallest factor (>1) = 3 questions asked = 5 number = 34 smallest factor (>1) = 2 questions asked = 5 number = 35 smallest factor (>1) = 5 questions asked = 5 number = 36 smallest factor (>1) = 2 questions asked = 5 number = 37 smallest factor (>1) = 37 questions asked = 5 number = 38 smallest factor (>1) = 2 questions asked = 5 number = 39 smallest factor (>1) = 3 questions asked = 5 number = 40 smallest factor (>1) = 2 questions asked = 5 number = 41 smallest factor (>1) = 41 questions asked = 6 number = 42 smallest factor (>1) = 2 questions asked = 5 number = 43 smallest factor (>1) = 43 questions asked = 6 number = 44 smallest factor (>1) = 2 questions asked = 5 number = 45 smallest factor (>1) = 3 questions asked = 5 number = 46 smallest factor (>1) = 2 questions asked = 5 number = 47 smallest factor (>1) = 47 questions asked = 5 number = 48 smallest factor (>1) = 2 questions asked = 5 number = 49 smallest factor (>1) = 7 questions asked = 5 number = 50 smallest factor (>1) = 2 questions asked = 5 number = 51 smallest factor (>1) = 3 questions asked = 5 number = 52 smallest factor (>1) = 2 questions asked = 5 number = 53 smallest factor (>1) = 53 questions asked = 5 number = 54 smallest factor (>1) = 2 questions asked = 5 number = 55 smallest factor (>1) = 5 questions asked = 5 number = 56 smallest factor (>1) = 2 questions asked = 5 number = 57 smallest factor (>1) = 3 questions asked = 5 number = 58 smallest factor (>1) = 2 questions asked = 5 number = 59 smallest factor (>1) = 59 questions asked = 5 number = 60 smallest factor (>1) = 2 questions asked = 5 number = 61 smallest factor (>1) = 61 questions asked = 6 number = 62 smallest factor (>1) = 2 questions asked = 5 number = 63 smallest factor (>1) = 3 questions asked = 5 number = 64 smallest factor (>1) = 2 questions asked = 5 number = 65 smallest factor (>1) = 5 questions asked = 5 number = 66 smallest factor (>1) = 2 questions asked = 5 number = 67 smallest factor (>1) = 67 questions asked = 6 number = 68 smallest factor (>1) = 2 questions asked = 5 number = 69 smallest factor (>1) = 3 questions asked = 5 number = 70 smallest factor (>1) = 2 questions asked = 5 number = 71 smallest factor (>1) = 71 questions asked = 5 number = 72 smallest factor (>1) = 2 questions asked = 5 number = 73 smallest factor (>1) = 73 questions asked = 5 number = 74 smallest factor (>1) = 2 questions asked = 5 number = 75 smallest factor (>1) = 3 questions asked = 5 number = 76 smallest factor (>1) = 2 questions asked = 5 number = 77 smallest factor (>1) = 7 questions asked = 5 number = 78 smallest factor (>1) = 2 questions asked = 5 number = 79 smallest factor (>1) = 79 questions asked = 5 number = 80 smallest factor (>1) = 2 questions asked = 5 number = 81 smallest factor (>1) = 3 questions asked = 5 number = 82 smallest factor (>1) = 2 questions asked = 5 number = 83 smallest factor (>1) = 83 questions asked = 5 number = 84 smallest factor (>1) = 2 questions asked = 5 number = 85 smallest factor (>1) = 5 questions asked = 5 number = 86 smallest factor (>1) = 2 questions asked = 5 number = 87 smallest factor (>1) = 3 questions asked = 5 number = 88 smallest factor (>1) = 2 questions asked = 5 number = 89 smallest factor (>1) = 89 questions asked = 5 number = 90 smallest factor (>1) = 2 questions asked = 5 number = 91 smallest factor (>1) = 7 questions asked = 5 number = 92 smallest factor (>1) = 2 questions asked = 5 number = 93 smallest factor (>1) = 3 questions asked = 5 number = 94 smallest factor (>1) = 2 questions asked = 5 number = 95 smallest factor (>1) = 5 questions asked = 5 number = 96 smallest factor (>1) = 2 questions asked = 5 number = 97 smallest factor (>1) = 97 questions asked = 5 number = 98 smallest factor (>1) = 2 questions asked = 5 number = 99 smallest factor (>1) = 3 questions asked = 5 number = 100 smallest factor (>1) = 2 questions asked = 5 number = 101 smallest factor (>1) = 101 questions asked = 5 number = 102 smallest factor (>1) = 2 questions asked = 5 number = 103 smallest factor (>1) = 103 questions asked = 6 number = 104 smallest factor (>1) = 2 questions asked = 5 number = 105 smallest factor (>1) = 3 questions asked = 5 number = 106 smallest factor (>1) = 2 questions asked = 5 number = 107 smallest factor (>1) = 107 questions asked = 6 number = 108 smallest factor (>1) = 2 questions asked = 5 number = 109 smallest factor (>1) = 109 questions asked = 5 number = 110 smallest factor (>1) = 2 questions asked = 5 number = 111 smallest factor (>1) = 3 questions asked = 5 number = 112 smallest factor (>1) = 2 questions asked = 5 number = 113 smallest factor (>1) = 113 questions asked = 5 number = 114 smallest factor (>1) = 2 questions asked = 5 number = 115 smallest factor (>1) = 5 questions asked = 5 number = 116 smallest factor (>1) = 2 questions asked = 5 number = 117 smallest factor (>1) = 3 questions asked = 5 number = 118 smallest factor (>1) = 2 questions asked = 5 number = 119 smallest factor (>1) = 7 questions asked = 5 number = 120 smallest factor (>1) = 2 questions asked = 5 number = 121 smallest factor (>1) = 11 questions asked = 5 number = 122 smallest factor (>1) = 2 questions asked = 5 number = 123 smallest factor (>1) = 3 questions asked = 5 number = 124 smallest factor (>1) = 2 questions asked = 5 number = 125 smallest factor (>1) = 5 questions asked = 5 number = 126 smallest factor (>1) = 2 questions asked = 5 number = 127 smallest factor (>1) = 127 questions asked = 5 number = 128 smallest factor (>1) = 2 questions asked = 5 number = 129 smallest factor (>1) = 3 questions asked = 5 number = 130 smallest factor (>1) = 2 questions asked = 5 number = 131 smallest factor (>1) = 131 questions asked = 6 number = 132 smallest factor (>1) = 2 questions asked = 5 number = 133 smallest factor (>1) = 7 questions asked = 5 number = 134 smallest factor (>1) = 2 questions asked = 5 number = 135 smallest factor (>1) = 3 questions asked = 5 number = 136 smallest factor (>1) = 2 questions asked = 5 number = 137 smallest factor (>1) = 137 questions asked = 6 number = 138 smallest factor (>1) = 2 questions asked = 5 number = 139 smallest factor (>1) = 139 questions asked = 5 number = 140 smallest factor (>1) = 2 questions asked = 5 number = 141 smallest factor (>1) = 3 questions asked = 5 number = 142 smallest factor (>1) = 2 questions asked = 5 number = 143 smallest factor (>1) = 11 questions asked = 5 number = 144 smallest factor (>1) = 2 questions asked = 5 number = 145 smallest factor (>1) = 5 questions asked = 5 number = 146 smallest factor (>1) = 2 questions asked = 5 number = 147 smallest factor (>1) = 3 questions asked = 5 number = 148 smallest factor (>1) = 2 questions asked = 5 number = 149 smallest factor (>1) = 149 questions asked = 5 number = 150 smallest factor (>1) = 2 questions asked = 5 number = 151 smallest factor (>1) = 151 questions asked = 5 number = 152 smallest factor (>1) = 2 questions asked = 5 number = 153 smallest factor (>1) = 3 questions asked = 5 number = 154 smallest factor (>1) = 2 questions asked = 5 number = 155 smallest factor (>1) = 5 questions asked = 5 number = 156 smallest factor (>1) = 2 questions asked = 5 number = 157 smallest factor (>1) = 157 questions asked = 6 number = 158 smallest factor (>1) = 2 questions asked = 5 number = 159 smallest factor (>1) = 3 questions asked = 5 number = 160 smallest factor (>1) = 2 questions asked = 5 number = 161 smallest factor (>1) = 7 questions asked = 5 number = 162 smallest factor (>1) = 2 questions asked = 5 number = 163 smallest factor (>1) = 163 questions asked = 6 number = 164 smallest factor (>1) = 2 questions asked = 5 number = 165 smallest factor (>1) = 3 questions asked = 5 number = 166 smallest factor (>1) = 2 questions asked = 5 worst case = 6 best case = 5 average case = 5.072727
So for this algorithm (let’s call it the worst-case algorithm), we found the possible solution set of 38 primes, then we divided it in half and asked which half iteratively until we were had a set of 1, and that’s the solution.
This is an intuitive way to have the best worst-case in terms of number of questions.
Dividing the set in half for each question is the most amount of progress we can make, getting rid of half the dataset each question.
And that’s the goal, getting to the solution in the least amount of questions in the worst case…
Puzzle part 2:
On average, i.e., assuming that the number was chosen in uniform distribution from 2 to 166 and we want to minimize the expected number of questions.
In the average case, our worst-case method is not very productive, look at the first question:
well there are 38 possible solutions, and we only care about worst case, so lets gather half of them and ask (Question 1) if the smallest integer divisor (>1) is in that group.
so dividing the set in two we have [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67] and [71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163]
but since 2 is the solution more than half the time, and 3 being the 2nd most likely, you can see that we really aren’t asking a 50/50 question. In this case we are asking a 146/19 question, in the average case we aren’t really making a lot of progress with that.
Le’t’s figure out the probabilty of the relevant primes in this dataset:
Python 3.1:
prime_count={p:0 for p in prime_list}
for n in range(2,maxx+1):
prime_count[smallestPrimeFactor(n)]+=1
n=0
for p in prime_list:
n+=1
print("prime#",n,"prime =",p," # of times it is the solution =", prime_count[p])
OR in java:
Hashtable<Integer,Integer> prime_count = new Hashtable<Integer,Integer>();
for(int p:prime_list){
prime_count.put(p,0);
}
for(int n=2;n<= maxx;n++){
prime_count.put(smallestPrimeFactor(n),prime_count.get(smallestPrimeFactor(n))+1);
}
int n = 0;
for(int p: prime_list){
n++;
System.out.println("prime#" + n + " prime = " + p + " # of times it is the solution = " + prime_count.get(p));
}
Both programs output this:
prime#01 prime = 2 # of times it is the solution = 83 prime#02 prime = 3 # of times it is the solution = 28 prime#03 prime = 5 # of times it is the solution = 11 prime#04 prime = 7 # of times it is the solution = 7 prime#05 prime = 11 # of times it is the solution = 3 prime#06 prime = 13 # of times it is the solution = 1 prime#07 prime = 17 # of times it is the solution = 1 prime#08 prime = 19 # of times it is the solution = 1 prime#09 prime = 23 # of times it is the solution = 1 prime#10 prime = 29 # of times it is the solution = 1 prime#11 prime = 31 # of times it is the solution = 1 prime#12 prime = 37 # of times it is the solution = 1 prime#13 prime = 41 # of times it is the solution = 1 prime#14 prime = 43 # of times it is the solution = 1 prime#15 prime = 47 # of times it is the solution = 1 prime#16 prime = 53 # of times it is the solution = 1 prime#17 prime = 59 # of times it is the solution = 1 prime#18 prime = 61 # of times it is the solution = 1 prime#19 prime = 67 # of times it is the solution = 1 prime#20 prime = 71 # of times it is the solution = 1 prime#21 prime = 73 # of times it is the solution = 1 prime#22 prime = 79 # of times it is the solution = 1 prime#23 prime = 83 # of times it is the solution = 1 prime#24 prime = 89 # of times it is the solution = 1 prime#25 prime = 97 # of times it is the solution = 1 prime#26 prime = 101 # of times it is the solution = 1 prime#27 prime = 103 # of times it is the solution = 1 prime#28 prime = 107 # of times it is the solution = 1 prime#29 prime = 109 # of times it is the solution = 1 prime#30 prime = 113 # of times it is the solution = 1 prime#31 prime = 127 # of times it is the solution = 1 prime#32 prime = 131 # of times it is the solution = 1 prime#33 prime = 137 # of times it is the solution = 1 prime#34 prime = 139 # of times it is the solution = 1 prime#35 prime = 149 # of times it is the solution = 1 prime#36 prime = 151 # of times it is the solution = 1 prime#37 prime = 157 # of times it is the solution = 1 prime#38 prime = 163 # of times it is the solution = 1
To develop the best average-case algorithm we need to evenly divide the solution set in 2 for each question, but not even in the number of primes in each set, but even in probability. We will try to get the sum of the number of times the prime is a solution to be the same in each half. Effectively choosing the division so that the question will be a 50/50 question.
Question 1 is easy (since 2 is the solution to just over half the numbers):
the set will be split into [2] and [ 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163]
this is a 83/82 question, nearly 50/50…
there is probability = 83/165 that the answer to this question will pinpoint the solution at 2.
Question 2 (if necessary): We now must divide the remaining into two nearly* equally probable sets:
so we can split the remaining up into sets like this:
[3,5,7] vs [11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163]
with weightings 28+11+7 vs 3+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1
so that’s 46 vs. 36
It’s very important to note that you can get more balanced groupings, for example grouping 3 (with occurrence score of 28) with 13 of the larger primes (each with 1 occurrence), yielding 42 vs. 42…
The problem with that is you will have a group with a 28 and 13 1’s, so the next matchup if that group is selected will be very imbalanced.
Since we cannot divide up the probability beyond per prime, we should need to consider these cases and not go for perfectly half every time, but instead find the best division for all the possible subsets that will be dealt with in later questions, that will give us the best average case. This is a very significant and interesting puzzle in itself, It’s almost a shame that a little luck and a naive approach to this can still yield the optimal solution for this dataset, but in a different dataset, this problem could not be glossed over.
Here is the algorithm to optimally construct the tree, with the English description following…
Python:
keys = [] values = [] for p in sorted(prime_count): keys.append([p]) values.append(prime_count[p]) print(keys) print(values while len(keys) > 2: min1_value = min(values) min1_index = values.index(min1_value) min1_key = keys[min1_index] del values[min1_index] del keys[min1_index] min2_value=min(values) min2_index = values.index(min2_value) min2_key = keys[min2_index] del keys[min2_index] del values[min2_index] keys.append(min1_key + min2_key) values.append(min1_value + min2_value) print() print(keys) print(values)
OR Java:
for(int p:prime_list){
HashSet<Integer> temp = new HashSet<Integer>();
temp.add(p);
keys.add(temp);
values.add(prime_count.get(p));
}
ArrayList<HashSet<Integer>> in_sets = new ArrayList<HashSet<Integer>>();
ArrayList<HashSet<Integer>> out_sets = new ArrayList<HashSet<Integer>>();
while(keys.size()>1){
int tmp_index = 0;
int min1_v = 0;
int min2_v = 0;
int min1_i = 0;
int min2_i = 0;
HashSet<Integer> min1_k = new HashSet<Integer>();
HashSet<Integer> min2_k = new HashSet<Integer>();
HashSet<Integer> min_union_k = new HashSet<Integer>();
for(int v:values){
if(tmp_index == 0 | v < min1_v){
min2_v = min1_v;
min2_k = min1_k;
min2_i = min1_i;
min1_v = v;
min1_k = keys.get(tmp_index);
min1_i = tmp_index;
}else if(v < min2_v | tmp_index == 1){
min2_v = v;
min2_k = keys.get(tmp_index);
min2_i = tmp_index;
}
tmp_index++;
}
min_union_k.addAll(min1_k);
min_union_k.addAll(min2_k);
values.remove(Math.max(min1_i, min2_i));
values.remove(Math.min(min1_i, min2_i));
keys.remove(Math.max(min1_i, min2_i));
keys.remove(Math.min(min1_i, min2_i));
values.add(min1_v+min2_v);
keys.add(min_union_k);
System.out.println();
System.out.println(keys);
System.out.println(values);
}
This code will incrementally merge the two smallest (in frequency score) sets of primes together, until there is just 2 sets, keeping a record of this will provide us with an answer-key to how to optimally partition the possible primes when we want to ask a question to narrow down our options.
Here is the output, it starts with all the primes, each in their own set, and below each list is the corresponding frequency scores for each set, when two of the least-valued primes are merged, their scores are summed:
[[2], [3], [5], [7], [11], [13], [17], [19], [23], [29], [31], [37], [41], [43], [47], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] [[2], [3], [5], [7], [11], [19], [23], [29], [31], [37], [41], [43], [47], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2] [[2], [3], [5], [7], [11], [29], [31], [37], [41], [43], [47], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2] [[2], [3], [5], [7], [11], [37], [41], [43], [47], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2] [[2], [3], [5], [7], [11], [43], [47], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [53], [59], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [61], [67], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [71], [73], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [79], [83], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [89], [97], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [101], [103], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [107], [109], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [113], [127], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [131], [137], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [139], [149], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137]] [83, 28, 11, 7, 3, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [151], [157], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149]] [83, 28, 11, 7, 3, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [163], [13, 17], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157]] [83, 28, 11, 7, 3, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] [[2], [3], [5], [7], [11], [19, 23], [29, 31], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17]] [83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3] [[2], [3], [5], [7], [11], [37, 41], [43, 47], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31]] [83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4] [[2], [3], [5], [7], [11], [53, 59], [61, 67], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47]] [83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 4] [[2], [3], [5], [7], [11], [71, 73], [79, 83], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67]] [83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 4, 4] [[2], [3], [5], [7], [11], [89, 97], [101, 103], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83]] [83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 2, 2, 3, 4, 4, 4, 4] [[2], [3], [5], [7], [11], [107, 109], [113, 127], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83], [89, 97, 101, 103]] [83, 28, 11, 7, 3, 2, 2, 2, 2, 2, 3, 4, 4, 4, 4, 4] [[2], [3], [5], [7], [11], [131, 137], [139, 149], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83], [89, 97, 101, 103], [107, 109, 113, 127]] [83, 28, 11, 7, 3, 2, 2, 2, 3, 4, 4, 4, 4, 4, 4] [[2], [3], [5], [7], [11], [151, 157], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83], [89, 97, 101, 103], [107, 109, 113, 127], [131, 137, 139, 149]] [83, 28, 11, 7, 3, 2, 3, 4, 4, 4, 4, 4, 4, 4] [[2], [3], [5], [7], [163, 13, 17], [19, 23, 29, 31], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83], [89, 97, 101, 103], [107, 109, 113, 127], [131, 137, 139, 149], [151, 157, 11]] [83, 28, 11, 7, 3, 4, 4, 4, 4, 4, 4, 4, 5] [[2], [3], [5], [7], [37, 41, 43, 47], [53, 59, 61, 67], [71, 73, 79, 83], [89, 97, 101, 103], [107, 109, 113, 127], [131, 137, 139, 149], [151, 157, 11], [163, 13, 17, 19, 23, 29, 31]] [83, 28, 11, 7, 4, 4, 4, 4, 4, 4, 5, 7] [[2], [3], [5], [7], [71, 73, 79, 83], [89, 97, 101, 103], [107, 109, 113, 127], [131, 137, 139, 149], [151, 157, 11], [163, 13, 17, 19, 23, 29, 31], [37, 41, 43, 47, 53, 59, 61, 67]] [83, 28, 11, 7, 4, 4, 4, 4, 5, 7, 8] [[2], [3], [5], [7], [107, 109, 113, 127], [131, 137, 139, 149], [151, 157, 11], [163, 13, 17, 19, 23, 29, 31], [37, 41, 43, 47, 53, 59, 61, 67], [71, 73, 79, 83, 89, 97, 101, 103]] [83, 28, 11, 7, 4, 4, 5, 7, 8, 8] [[2], [3], [5], [7], [151, 157, 11], [163, 13, 17, 19, 23, 29, 31], [37, 41, 43, 47, 53, 59, 61, 67], [71, 73, 79, 83, 89, 97, 101, 103], [107, 109, 113, 127, 131, 137, 139, 149]] [83, 28, 11, 7, 5, 7, 8, 8, 8] [[2], [3], [5], [163, 13, 17, 19, 23, 29, 31], [37, 41, 43, 47, 53, 59, 61, 67], [71, 73, 79, 83, 89, 97, 101, 103], [107, 109, 113, 127, 131, 137, 139, 149], [151, 157, 11, 7]] [83, 28, 11, 7, 8, 8, 8, 12] [[2], [3], [5], [71, 73, 79, 83, 89, 97, 101, 103], [107, 109, 113, 127, 131, 137, 139, 149], [151, 157, 11, 7], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67]] [83, 28, 11, 8, 8, 12, 15] [[2], [3], [5], [151, 157, 11, 7], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67], [71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149]] [83, 28, 11, 12, 15, 16] [[2], [3], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67], [71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149], [5, 151, 157, 11, 7]] [83, 28, 15, 16, 23] [[2], [3], [5, 151, 157, 11, 7], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149]] [83, 28, 23, 31] [[2], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149], [5, 151, 157, 11, 7, 3]] [83, 31, 51] [[2], [163, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 5, 151, 157, 11, 7, 3]] [83, 82]
The ouptut shows the algorithm in action, essentially building the partition tree from the ground, in each pair of lists, the first is a list of groups of primes, the 2nd is the corresponding frequency value of each group. The final code will record each merging, each step, as a guide to how to partition any remaining group. It will map out the partition tree of the possible solutions in a fashion that is the optimal in average-case.
Question 3+
so we simply move the partition from left to right (in the ordered list of remaining possible least prime factors) until the left group has equal to or greater than half the total occurrences as solutions to the 165 numbers.
Here is the complete code implementing this average-case algorithm and applying it to each number over the range of this problem, computing average-case performance in the process.
The complete program in Python 3.1:
import time
t0=time.time()
prime_list = []
maxx = 166
for n in range(2, maxx+1):
isprime = True
for p in prime_list:
if n % p is 0:
isprime = False
break
if isprime == True:
prime_list.append(n)
def smallestPrimeFactor(n):
for p in prime_list:
if n % p == 0:
return p
prime_count={p : 0 for p in prime_list}
for n in range(2, maxx + 1):
prime_count[smallestPrimeFactor(n)] += 1
keys = []
values = []
in_sets = []
out_sets = []
for p in sorted(prime_count):
keys.append([p])
values.append(prime_count[p])
while len(keys) > 1:
min1_value = min(values)
min1_index = values.index(min1_value)
min1_key = keys[min1_index]
del values[min1_index]
del keys[min1_index]
min2_value = min(values)
min2_index = values.index(min2_value)
min2_key = keys[min2_index]
del keys[min2_index]
del values[min2_index]
in_sets.append(set(min1_key + min2_key))
out_sets.append(set(min1_key))
keys.append(min1_key + min2_key)
values.append(min1_value + min2_value)
#now that we have a partition answer-key (mapping of in_sets to out_sets)
#let's increment through the entire range of integers and compute the algorithm's performance
print("done with partition answer-key, time elapsed =", int(1000*(time.time()-t0)), "ms")
question_history = []
for n in range(2, maxx + 1):
prime_subset = set(prime_list)
question_count = 0
while True:
question_count += 1
checkset = out_sets[in_sets.index(prime_subset)]
if smallestPrimeFactor(n) in checkset:
prime_subset = checkset
else:
prime_subset = prime_subset-checkset
if len(prime_subset) == 1:
print( "number =", n, " smallest factor (>1) =", list(prime_subset)[0], " questions asked =", question_count)
question_history.append(question_count)
break
print( "worst case =", max(question_history), "questions" )
print( "best case =", min(question_history), "questoins" )
print( "average case =", sum(question_history), "/", (maxx-1), "~=",sum(question_history)/(maxx-1), "questions" )
print( "time elapsed =", int((time.time() - t0)*1000),"ms")
Alternatively, the complete Java program:
import java.util.*;
import java.lang.Math;
public class primes {
public static ArrayList<Integer> prime_list = new ArrayList<Integer>();
public static void main(String args[]){
Long t0 = System.currentTimeMillis();
int maxx = 166;
boolean isPrime;
for(int n=2; n <= maxx; n++){
isPrime = true;
for(int p : prime_list){
if(n % p == 0){
isPrime = false;
break;
}
}
if(isPrime){
prime_list.add(n);
}
}
Hashtable<Integer,Integer> prime_count = new Hashtable<Integer,Integer>();
ArrayList<Integer> values = new ArrayList<Integer>();
ArrayList<HashSet<Integer>> keys = new ArrayList<HashSet<Integer>>();
HashSet<Integer> prime_set = new HashSet<Integer>();
for(int p:prime_list){
prime_count.put(p,0);
prime_set.add(p);
}
for(int n=2; n <= maxx; n++){
prime_count.put(smallestPrimeFactor(n),prime_count.get(smallestPrimeFactor(n))+1);
}
for(int p:prime_list){
HashSet<Integer> temp = new HashSet<Integer>();
temp.add(p);
keys.add(temp);
values.add(prime_count.get(p));
}
ArrayList<HashSet<Integer>> in_sets = new ArrayList<HashSet<Integer>>();
ArrayList<HashSet<Integer>> out_sets = new ArrayList<HashSet<Integer>>();
while(keys.size()>1){
int tmp_index = 0;
int min1_v = 0;
int min2_v = 0;
int min1_i = 0;
int min2_i = 0;
HashSet<Integer> min1_k = new HashSet<Integer>();
HashSet<Integer> min2_k = new HashSet<Integer>();
HashSet<Integer> min_union_k = new HashSet<Integer>();
for(int v:values){
if(tmp_index == 0 | v < min1_v){
min2_v = min1_v;
min2_k = min1_k;
min2_i = min1_i;
min1_v = v;
min1_k = keys.get(tmp_index);
min1_i = tmp_index;
}else if(v < min2_v | tmp_index == 1){
min2_v = v;
min2_k = keys.get(tmp_index);
min2_i = tmp_index;
}
tmp_index++;
}
min_union_k.addAll(min1_k);
min_union_k.addAll(min2_k);
in_sets.add(min_union_k);
out_sets.add(min2_k);
values.remove(Math.max(min1_i, min2_i));
values.remove(Math.min(min1_i, min2_i));
keys.remove(Math.max(min1_i, min2_i));
keys.remove(Math.min(min1_i, min2_i));
values.add(min1_v+min2_v);
keys.add(min_union_k);
}
//now that we have a partition answer-key (mapping of in_sets to out_sets)
//let's increment through the entire range of integers and compute the algorithm's performance
System.out.println("done with partition answer-key, time elapsed = " + (System.currentTimeMillis()-t0) + " ms");
int max_questions = -1;
int min_questions = -1;
int questions_sum = 0;
for(int n=2; n <= maxx; n++){
HashSet<Integer> prime_subset = new HashSet<Integer>();
prime_subset.addAll(prime_set);
int question_count = 0;
int index = -1;
while(true){
question_count++;
index = in_sets.indexOf(prime_subset);
if( out_sets.get(index).contains(smallestPrimeFactor(n))){
prime_subset.clear();
prime_subset.addAll(out_sets.get(index));
}else{
prime_subset.removeAll(out_sets.get(index));
}
if(prime_subset.size() == 1){
System.out.println("number = " + n + " smallest factor (>1) = " + prime_subset.toArray()[0] + " questions asked = " + question_count);
if(question_count > max_questions){
max_questions = question_count;
}
if(question_count < min_questions || min_questions == -1){
min_questions = question_count;
}
questions_sum += question_count;
break;
}
}
}
System.out.println("worst case = " + max_questions + " questions");
System.out.println("best case = " + min_questions + " questions");
System.out.println("average case = " + questions_sum + " / " + (maxx-1) + " ~= " + (float) questions_sum/(maxx-1) + " questions");
System.out.println("time elapsed = " + (System.currentTimeMillis()-t0) + " ms");
}
public static int smallestPrimeFactor(int n){
for(int p: prime_list){
if(n % p == 0 ){
return p;
}
}
return 0;
}
}
The code could be optimized for speed (particularly when obtaining the index in the partition list (in_sets) to use on out_sets, currently it’s dumb, but could be improved with some sort of hash-map). With 166 integers, speed of execution was not an issue.
Both output:
done with partition answer-key, time elapsed = 0 ms number = 2 smallest factor (>1) = 2 questions asked = 1 number = 3 smallest factor (>1) = 3 questions asked = 3 number = 4 smallest factor (>1) = 2 questions asked = 1 number = 5 smallest factor (>1) = 5 questions asked = 4 number = 6 smallest factor (>1) = 2 questions asked = 1 number = 7 smallest factor (>1) = 7 questions asked = 5 number = 8 smallest factor (>1) = 2 questions asked = 1 number = 9 smallest factor (>1) = 3 questions asked = 3 number = 10 smallest factor (>1) = 2 questions asked = 1 number = 11 smallest factor (>1) = 11 questions asked = 6 number = 12 smallest factor (>1) = 2 questions asked = 1 number = 13 smallest factor (>1) = 13 questions asked = 7 number = 14 smallest factor (>1) = 2 questions asked = 1 number = 15 smallest factor (>1) = 3 questions asked = 3 number = 16 smallest factor (>1) = 2 questions asked = 1 number = 17 smallest factor (>1) = 17 questions asked = 7 number = 18 smallest factor (>1) = 2 questions asked = 1 number = 19 smallest factor (>1) = 19 questions asked = 7 number = 20 smallest factor (>1) = 2 questions asked = 1 number = 21 smallest factor (>1) = 3 questions asked = 3 number = 22 smallest factor (>1) = 2 questions asked = 1 number = 23 smallest factor (>1) = 23 questions asked = 7 number = 24 smallest factor (>1) = 2 questions asked = 1 number = 25 smallest factor (>1) = 5 questions asked = 4 number = 26 smallest factor (>1) = 2 questions asked = 1 number = 27 smallest factor (>1) = 3 questions asked = 3 number = 28 smallest factor (>1) = 2 questions asked = 1 number = 29 smallest factor (>1) = 29 questions asked = 7 number = 30 smallest factor (>1) = 2 questions asked = 1 number = 31 smallest factor (>1) = 31 questions asked = 7 number = 32 smallest factor (>1) = 2 questions asked = 1 number = 33 smallest factor (>1) = 3 questions asked = 3 number = 34 smallest factor (>1) = 2 questions asked = 1 number = 35 smallest factor (>1) = 5 questions asked = 4 number = 36 smallest factor (>1) = 2 questions asked = 1 number = 37 smallest factor (>1) = 37 questions asked = 7 number = 38 smallest factor (>1) = 2 questions asked = 1 number = 39 smallest factor (>1) = 3 questions asked = 3 number = 40 smallest factor (>1) = 2 questions asked = 1 number = 41 smallest factor (>1) = 41 questions asked = 7 number = 42 smallest factor (>1) = 2 questions asked = 1 number = 43 smallest factor (>1) = 43 questions asked = 7 number = 44 smallest factor (>1) = 2 questions asked = 1 number = 45 smallest factor (>1) = 3 questions asked = 3 number = 46 smallest factor (>1) = 2 questions asked = 1 number = 47 smallest factor (>1) = 47 questions asked = 7 number = 48 smallest factor (>1) = 2 questions asked = 1 number = 49 smallest factor (>1) = 7 questions asked = 5 number = 50 smallest factor (>1) = 2 questions asked = 1 number = 51 smallest factor (>1) = 3 questions asked = 3 number = 52 smallest factor (>1) = 2 questions asked = 1 number = 53 smallest factor (>1) = 53 questions asked = 7 number = 54 smallest factor (>1) = 2 questions asked = 1 number = 55 smallest factor (>1) = 5 questions asked = 4 number = 56 smallest factor (>1) = 2 questions asked = 1 number = 57 smallest factor (>1) = 3 questions asked = 3 number = 58 smallest factor (>1) = 2 questions asked = 1 number = 59 smallest factor (>1) = 59 questions asked = 7 number = 60 smallest factor (>1) = 2 questions asked = 1 number = 61 smallest factor (>1) = 61 questions asked = 7 number = 62 smallest factor (>1) = 2 questions asked = 1 number = 63 smallest factor (>1) = 3 questions asked = 3 number = 64 smallest factor (>1) = 2 questions asked = 1 number = 65 smallest factor (>1) = 5 questions asked = 4 number = 66 smallest factor (>1) = 2 questions asked = 1 number = 67 smallest factor (>1) = 67 questions asked = 7 number = 68 smallest factor (>1) = 2 questions asked = 1 number = 69 smallest factor (>1) = 3 questions asked = 3 number = 70 smallest factor (>1) = 2 questions asked = 1 number = 71 smallest factor (>1) = 71 questions asked = 7 number = 72 smallest factor (>1) = 2 questions asked = 1 number = 73 smallest factor (>1) = 73 questions asked = 7 number = 74 smallest factor (>1) = 2 questions asked = 1 number = 75 smallest factor (>1) = 3 questions asked = 3 number = 76 smallest factor (>1) = 2 questions asked = 1 number = 77 smallest factor (>1) = 7 questions asked = 5 number = 78 smallest factor (>1) = 2 questions asked = 1 number = 79 smallest factor (>1) = 79 questions asked = 7 number = 80 smallest factor (>1) = 2 questions asked = 1 number = 81 smallest factor (>1) = 3 questions asked = 3 number = 82 smallest factor (>1) = 2 questions asked = 1 number = 83 smallest factor (>1) = 83 questions asked = 7 number = 84 smallest factor (>1) = 2 questions asked = 1 number = 85 smallest factor (>1) = 5 questions asked = 4 number = 86 smallest factor (>1) = 2 questions asked = 1 number = 87 smallest factor (>1) = 3 questions asked = 3 number = 88 smallest factor (>1) = 2 questions asked = 1 number = 89 smallest factor (>1) = 89 questions asked = 7 number = 90 smallest factor (>1) = 2 questions asked = 1 number = 91 smallest factor (>1) = 7 questions asked = 5 number = 92 smallest factor (>1) = 2 questions asked = 1 number = 93 smallest factor (>1) = 3 questions asked = 3 number = 94 smallest factor (>1) = 2 questions asked = 1 number = 95 smallest factor (>1) = 5 questions asked = 4 number = 96 smallest factor (>1) = 2 questions asked = 1 number = 97 smallest factor (>1) = 97 questions asked = 7 number = 98 smallest factor (>1) = 2 questions asked = 1 number = 99 smallest factor (>1) = 3 questions asked = 3 number = 100 smallest factor (>1) = 2 questions asked = 1 number = 101 smallest factor (>1) = 101 questions asked = 7 number = 102 smallest factor (>1) = 2 questions asked = 1 number = 103 smallest factor (>1) = 103 questions asked = 7 number = 104 smallest factor (>1) = 2 questions asked = 1 number = 105 smallest factor (>1) = 3 questions asked = 3 number = 106 smallest factor (>1) = 2 questions asked = 1 number = 107 smallest factor (>1) = 107 questions asked = 7 number = 108 smallest factor (>1) = 2 questions asked = 1 number = 109 smallest factor (>1) = 109 questions asked = 7 number = 110 smallest factor (>1) = 2 questions asked = 1 number = 111 smallest factor (>1) = 3 questions asked = 3 number = 112 smallest factor (>1) = 2 questions asked = 1 number = 113 smallest factor (>1) = 113 questions asked = 7 number = 114 smallest factor (>1) = 2 questions asked = 1 number = 115 smallest factor (>1) = 5 questions asked = 4 number = 116 smallest factor (>1) = 2 questions asked = 1 number = 117 smallest factor (>1) = 3 questions asked = 3 number = 118 smallest factor (>1) = 2 questions asked = 1 number = 119 smallest factor (>1) = 7 questions asked = 5 number = 120 smallest factor (>1) = 2 questions asked = 1 number = 121 smallest factor (>1) = 11 questions asked = 6 number = 122 smallest factor (>1) = 2 questions asked = 1 number = 123 smallest factor (>1) = 3 questions asked = 3 number = 124 smallest factor (>1) = 2 questions asked = 1 number = 125 smallest factor (>1) = 5 questions asked = 4 number = 126 smallest factor (>1) = 2 questions asked = 1 number = 127 smallest factor (>1) = 127 questions asked = 7 number = 128 smallest factor (>1) = 2 questions asked = 1 number = 129 smallest factor (>1) = 3 questions asked = 3 number = 130 smallest factor (>1) = 2 questions asked = 1 number = 131 smallest factor (>1) = 131 questions asked = 7 number = 132 smallest factor (>1) = 2 questions asked = 1 number = 133 smallest factor (>1) = 7 questions asked = 5 number = 134 smallest factor (>1) = 2 questions asked = 1 number = 135 smallest factor (>1) = 3 questions asked = 3 number = 136 smallest factor (>1) = 2 questions asked = 1 number = 137 smallest factor (>1) = 137 questions asked = 7 number = 138 smallest factor (>1) = 2 questions asked = 1 number = 139 smallest factor (>1) = 139 questions asked = 7 number = 140 smallest factor (>1) = 2 questions asked = 1 number = 141 smallest factor (>1) = 3 questions asked = 3 number = 142 smallest factor (>1) = 2 questions asked = 1 number = 143 smallest factor (>1) = 11 questions asked = 6 number = 144 smallest factor (>1) = 2 questions asked = 1 number = 145 smallest factor (>1) = 5 questions asked = 4 number = 146 smallest factor (>1) = 2 questions asked = 1 number = 147 smallest factor (>1) = 3 questions asked = 3 number = 148 smallest factor (>1) = 2 questions asked = 1 number = 149 smallest factor (>1) = 149 questions asked = 7 number = 150 smallest factor (>1) = 2 questions asked = 1 number = 151 smallest factor (>1) = 151 questions asked = 7 number = 152 smallest factor (>1) = 2 questions asked = 1 number = 153 smallest factor (>1) = 3 questions asked = 3 number = 154 smallest factor (>1) = 2 questions asked = 1 number = 155 smallest factor (>1) = 5 questions asked = 4 number = 156 smallest factor (>1) = 2 questions asked = 1 number = 157 smallest factor (>1) = 157 questions asked = 7 number = 158 smallest factor (>1) = 2 questions asked = 1 number = 159 smallest factor (>1) = 3 questions asked = 3 number = 160 smallest factor (>1) = 2 questions asked = 1 number = 161 smallest factor (>1) = 7 questions asked = 5 number = 162 smallest factor (>1) = 2 questions asked = 1 number = 163 smallest factor (>1) = 163 questions asked = 6 number = 164 smallest factor (>1) = 2 questions asked = 1 number = 165 smallest factor (>1) = 3 questions asked = 3 number = 166 smallest factor (>1) = 2 questions asked = 1 worst case = 7 questions best case = 1 questions average case = 494 / 165 ~= 2.99393939394 questions time elapsed = 18 ms
Here we see that we can figure out the smallest positive integer divisor greater than 1 in under 3 questions on average with this methodology, even though the worst case is a little worse than our algorithm for part 1 of this puzzle.
I tried to keep the Java and Python code comparable (although not always asymptotically optimal) so we could have a rough comparison of performance.
With 166 numbers, python does the job a little faster (18ms vs. 58ms), but for 16,600 numbers, Java is about 50% faster. Java becomes an order of magnitude faster as we get to 166,000 numbers.
This methodology impressively determines the smallest prime factor of any positive integer from 2 to 166,000 in ~4.04 questions on average (that’s over 15,000 primes)!
This entry was posted on Monday, March 15th, 2010 at 19:55 and is filed under Uncategorized. Find similar posts by selecting any of the following tags: . You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.
on 17.03.2010 at 17:20 Whitney Sorenson wrote:
First of all, you can clean up your java:
import java.util.ArrayList; public class primes { public static void main(String args[]) { ArrayList prime_list = new ArrayList(); for (int n = 2; n <= 166; n++) { boolean isprime = true; for (int p : prime_list) { if ((n / p) == (float) n / p) { isprime = false; break; } } if (isprime) { prime_list.add(n); } } System.out.println("the primes less than 167 are:"); System.out.println(prime_list); System.out.println("there are " + prime_list.size() + " of them"); }But really, you should use a sieve of eratosthenes:
import java.util.ArrayList; public class primes { public static void main(String args[]) { ArrayList prime_list = new ArrayList(); int upperBoundSquareRoot = (int) Math.sqrt(166); boolean[] isComposite = new boolean[166 + 1]; for (int m = 2; m <= upperBoundSquareRoot; m ) { if (!isComposite[m]) { prime_list.add(m); for (int k = m * m; k <= 166; k = m) isComposite[k] = true; } } for (int m = upperBoundSquareRoot; m <= 166; m++) if (!isComposite[m]) prime_list.add(m); System.out.println("the primes less than 167 are:"); System.out.println(prime_list); System.out.println("there are " + prime_list.size() + " of them"); } }on 18.03.2010 at 03:29 stephen wrote:
The sieve of eratosthenes (I never knew the name until now) is asymptotically more efficient in cpu use, although it’s a bit more complex to program and requires memory use.
It’s nearly instant either way, but do you think in this case (first 38 primes) it makes up for the extra cost of creating/allocating the boolean array?
The density of primes decreases as you get further down the number line ( it’s like ~1/ln(n) ), so I could see this algorithm becoming a must-use for finding a larger number of primes. (although lower density of primes means higher memory use per prime for the sieve of eratosthenes as we look for larger and larger primes)
Thanks.
btw didn’t know you could iterate through an ArrayList with that syntax like you can in python with the for loop. nice.
on 18.03.2010 at 14:41 stephen wrote:
I saw in python n%d is mod division, and yields the remainder, this could/should be much faster than comparing the float of the fraction with the floor of the float of the fraction, which is a round-about way to check if d divides n evenly.
When finding the first 15,173 primes, the ‘float’ to the ‘floor of float’ comparison took 129 seconds, mod-division took 34 seconds.
specifically :
import time t0=time.time() prime_list = [] maxx = 166000 for n in range(2,maxx+1): isprime = True for p in prime_list: if n/p == int (n/p): isprime = False; break; if isprime == True: prime_list.append(n) print("the primes less than",maxx+1,"are:") #print(prime_list) print("there are ", len(prime_list), "of them") print((time.time()-t0),"secs")vs
import time t0=time.time() prime_list = [] maxx = 166000 for n in range(2,maxx+1): isprime = True for p in prime_list: if n%p is 0; isprime = False; break; if isprime == True: prime_list.append(n) print("the primes less than",maxx+1,"are:") #print(prime_list) print("there are ", len(prime_list), "of them") print((time.time()-t0),"secs")oh yeah, in java the the times were 1.6 seconds vs 0.98 seconds respectively.(!)
on 18.03.2010 at 18:45 stephen wrote:
so just for fun i ran a couple tests with the sieve of eratosthenes vs the brute mod division algorithm i used,
public class primes { public static ArrayList<Integer> prime_list = new ArrayList<Integer>(); public static void main(String args[]){ int maxx = 166; long t0 = System.nanoTime(); boolean[] excludearray = new boolean[maxx+1]; for(int n=2;n<=maxx;n++){ if(excludearray[n] == false){ prime_list.add(n); for(int m=2; m<=maxx/n;m++){ excludearray[n*m] = true; } } } //System.out.println("the primes less than 167 are:"); //System.out.println(prime_list); //System.out.println("there are " prime_list.size() " of them"); long time1 = System.nanoTime()-t0; prime_list = new ArrayList<Integer>(); long t2 = System.nanoTime(); boolean isPrime; for(int n=2; n <= maxx; n++){ isPrime = true; for(int p : prime_list){ if(n % p == 0){ isPrime = false; break; } } if(isPrime){ prime_list.add(n); } } //System.out.println("the primes less than 167 are:"); //System.out.println(prime_list); //System.out.println("there are " prime_list.size() " of them"); long time2 = System.nanoTime()-t2; System.out.println(time1 + " nanoseconds for sieve for primes under " + (maxx+1)); System.out.println(time2 + " nanoseconds for brute for primes under "+ (maxx+1));the output is:
so the brute force is slightly faster for the first 38 primes, but lets look at a larger set of primes, I imagine that boolean array will do far more than make up for its creation-time-cost:
for the first 15,173 primes: