CPSC 352 -- Artificial Intelligence -- Fall 2005
Machine Learning Practice Problems -- Due: Friday, 11/18

1. Create a decision tree for the following table of cases. You are trying to decide the heart attach risk. For each level of the tree, choose properties in alphabetical order.
NRiskWGTHGTCollesterol
1highhighshortmore than 200
2highhighmedmore than 200
3medhightall100-200
4lowhightallless than 100
5lowmedtall100-200
6medmedshort100-200
7lowmedshortless than 100
8highmedmedmore than 200
9medlowshortmore than 200
10medlowtallmore than 200
11lowlowmedless than 100
12medlowmedmore than 200
Will do in the Q/A session.

2. Show that the McCulloch-Pitts neuron with the following activation function will not calculate the logic function implies: -x + y + 1. Assume that the threshold is 0 -- that is, the neuron fires when its activation is greater than or equal to 0. And assume that the inputs are either 1 or 0.

The following table shows that the functions does not implement implies, because every output is 1. The output from the second row of the table should be 0.
XY-X+Y+1Output
1111
1001
0121
0011

3. Design a McCulloch-Pitts neuron to calculate the logic function implies. HINT: Draw the truth table for implies. For each row of the table, make an inequality in which the left hand side has the following form: w1 * X + w2 * Y + c. For the first row, you would get the inequality: w1 + w2 + c >= 0 because X=1, Y=1, and the function is true in that case. Solve the four inequalities for w1, w2, and c.

The following table, with the function -2X+Y+1 implements the boolean implies function. It's output matches the desired output for implies. To derive the weights for X, Y and C, you must solve the following inequalities:

w1 + w2 + C >= 0
w1      + C < 0
     w2 + C >= 0
          C >= 0

If we let C be 1, then w1 must be less than -1. So let w1 be -2. Then w2 must be greater than or equal to 1. So let w2 be 1. NOTE: These are not the only weights that will work.
XY-2X+Y+1Output
1101
10-10
0121
0011

4. For the XOR network in figure 10.12 on page 437, determine whether the following weight parameters provide a network that recognizes XOR: WH1 = -2.0, WH2 = -2.0, WHB = 2.5, WOB=2.0, W01= -5.0m W02= - 4.0, WOH=-11.0.

The following table gives the values for the hidden node (H) and the network's output (O). Since all the outputs are 0, this is not the XOR function.
I1I2HOutput
11-1.5-5-4+2=-7 outputs 0
100.5-11-5+2=-14 outputs 0
010.5-11-4+2=-13 outputs 0
002.5-11+2=-9 outputs 0

5. Suppose you are trying to solve the following bin-packing problem. You have a collection of boxes numbers 0, 1, 2, ..., n-1. Each box has an associated weight. You have a colleciton of containers, each of which has a maximum weight of W. The idea is to place boxes into a container, never exceeding its capacity but filling it as closely as possible to its capacity. For example, suppose b1=10, b2=15, and b3=10. and W=20. If you place b1 and b2 into the container, you have 5 pounds of unused capacity. However, if you place b1 and b3 in the container, you have achieved an optimal packing. This is the kind of optimization problem that genetic algorithms can be handle.

Develop a binary function that can be used by a GA for this problem. Illustrate its use with an example involving a small number of boxes. Show that your representation would allow both crossover and mutation.

For each box, b0, b1, ..., let the chromosome have a 1 in position 0, 1, ... if the corresponding box is in the container and a 0 otherwise. So for boxes b0 through b3, 0010 means that only b2 is in the container. 1111 means that all four boxes are in the container. If you perform crossover in the middle of 0010 and 1001 you get 1010 and 0001, which are both valid chromosomes. If you mutate 1111 to 1101 then boxes b0,b1, and b3 are in the container, which is a valid packing.

6. For the same GA as in the previous problem, describe an evaluation function that can be used to determine the fitness of the different chromosomes. In other words, given your representation in problem 5, describe a way to distinguish a good fit from a bad fit. Obviously a chromosome in which the total weight of the container exceeds its capacity is a bad fit. One that perfectly fills the container is a perfect fit. (You can assume that the program has a separate data structure to store the weights of the individual boxes so that b1's weight can be looked up in a table, and so on.

Given the above representation, we determine fitness by computing the sum of the weights of the boxes in the container. For chromosome 0110, we sum the weights of b1 and b2. These weights can be stored in a table that is indexed by the index of the bit in the chromosome. If the capacity of the container is C and the sum of the weights is W, then if C-W = 0, that is an optimal fit. If W is greater than C, then that is evaluated as infinity, the worst possible value. If C is greater than W, then C-W gives the fit. Thus, the closer to 0, the better the fit.