## Abstract

Animals seek information to reduce the effort needed to obtain rewards, and they perform actions that enable them to gain more information. The ability to seek information subserves higher cognitive processes such as planning and reasoning. Little is known about how the brain measures and seeks information. In this study, I discuss results indicating that the brain quantifies information by using the information-theoretic measure. Monkeys were trained to perform a saccadic eye movement to one of several visual targets. When required to choose among targets that carried varying amounts of information regarding the goal, the animals selected the most informative target. While the animals made a choice, neurons in the dorsal premotor cortex exhibited activity that reflected the corresponding information value. The population response of these neurons was examined using the following three measures: the information-theoretic measure, probability gain, and absolute change in beliefs. Changes in this response showed relatively similar proportionality to the three measures. An analysis of two intuitive conditions for information measures, decreasing monotonicity with probability and additivity between independent events, showed that only the information-theoretic measure satisfies both conditions. These results suggest that, in comparison with the other measures, the information-theoretic measure is a more plausible measure of information in the brain.

## INTRODUCTION

Animals, particularly humans, continually seek not only greater rewards but also more information. How do they measure information? Information can be quantified by using information entropy (Shannon and Weaver 1949). For example, when identifying a playing card, the number on the card is a more informative clue than the suit of the card. This notion is quantified as follows: the number and suit carry information of –log(1/13) and –log(1/4), respectively, and –log(1/13) is greater than –log(1/4). Together, these two clues identify the card; therefore the sum of the information they contain should be equal to the information required to identify the card, which is –log(1/52). In fact, –log(1/13) + [–log(1/4)] = –log(1/52). This indicates that the information-theoretic measure is consistent with human intuition with regard to information value, that is, the amount of information decreases with the increasing probabilities of predicted events and is additive between stochastically independent events. This consistency suggests that quantities related to the information-theoretic measure are used by the brain to quantify information.
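The card arithmetic above can be checked numerically. The following is a minimal sketch, not part of the original study; it assumes base-2 logarithms (the text above does not fix the base), and the helper name `surprisal_bits` is introduced here for illustration only.

```python
import math

def surprisal_bits(p):
    """Information content -log2(p), in bits, of an event with probability p."""
    return -math.log2(p)

number_bits = surprisal_bits(1 / 13)  # learning the number on the card
suit_bits = surprisal_bits(1 / 4)     # learning the suit of the card
card_bits = surprisal_bits(1 / 52)    # identifying the card outright

# The number is the more informative clue, and the two clues add up to the card.
print(f"number: {number_bits:.3f} bits, suit: {suit_bits:.3f} bits")
print(f"number + suit: {number_bits + suit_bits:.3f} bits, card: {card_bits:.3f} bits")
```

The additivity holds for any logarithm base; only the unit changes.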

The information-theoretic measure has been applied to a range of problems. For example, human category learning has been explained using information entropy (Corter and Gluck 1992), and the measure has been used to formulate the hypothesis-testing process (Nelson 2005). It is also useful for learning and representing human knowledge in machine intelligence (Nakamura et al. 1983; Quinlan 1983). A neural system has been shown to adaptively maximize information transmission by spike trains (Fairhall et al. 2001). These findings also suggest that the brain may quantify information using the information-theoretic measure.

Recent studies have shown that the dorsal premotor cortex (PMd) is involved in motor control based on higher cognitive functions, such as mental rehearsal (Cisek and Kalaska 2004) and memory and generation of motor sequences (Ohbayashi et al. 2003). This suggests that the PMd may also be involved in another higher function, namely the measurement of information.

The information measure is calculated from probability values. The dopamine neurons of the ventral midbrain show activity that reflects probabilistic uncertainty (Fiorillo et al. 2003). According to functional magnetic resonance imaging studies, activity in the human midbrain region is modulated by probabilistic uncertainty (Volz et al. 2003) and information entropy (Aron et al. 2004). The dopamine neurons project to the PMd (Berger et al. 1991; Ilinsky et al. 1985; Williams and Goldman-Rakic 1993). The neurons of the lateral intraparietal area are sensitive to the expected reward probability (Platt and Glimcher 1999). This area is connected with the PMd through a neural pathway via the prefrontal cortex (Andersen et al. 1985; Cavada and Goldman-Rakic 1989; Lu et al. 1994). These connections also suggest that the PMd may be involved in computing the information measure by using signals from the areas that project onto it. In this study, I explored the possibility that the brain measures information by recording from single neurons in the PMd of monkeys.

## METHODS

The experiment involved two monkeys (*Macaca fuscata*). A head-restraint prosthesis was implanted in an aseptic surgical procedure. The monkeys were tranquilized with ketamine (10 mg/kg), administered atropine, and subsequently anesthetized with pentobarbital sodium (4.5–6.0 mg · kg^{−1} · h^{−1} iv). Once the monkeys could reliably execute all the behavioral tasks involved in the study, a 15-mm craniotomy was performed, and a Delrin recording chamber was mounted. All the surgical and experimental protocols were approved by the Animal Care and Use Committee of Tokyo Institute of Technology and were in accordance with the Japan Neuroscience Society Guidelines for Care and Use of Laboratory Animals in Neuroscience.

### Behavioral task

The monkeys were trained to perform three tasks with varying expected amounts of information. They were seated in a primate chair facing a computer display. The eye position was monitored with an infrared eye-tracking system. To begin a run of trials for task A, the monkey fixated on a central cross (size, 1°) presented on a black background for 200 ms (Fig. 1). During fixation, the monkey had to maintain its gaze within 2° of the fixation point. Subsequently, six white dots (size, 1°) appeared around the cross at an eccentricity of 7°. One dot was randomly designated as the reward target, whereas the others were distractors. The monkey maintained fixation on the central cross for 1,000 ms. The disappearance of the fixation cross cued the monkey to select a dot by a saccadic eye movement. The monkey was free to choose any of the six dots. When it fixated on a dot for 300 ms, the dot was regarded as its choice. When the chosen dot was a distractor, it turned green and the central cross reappeared. The monkey again fixated on the cross for 1,000 ms, and the disappearance of the cross cued the selection of another dot. This sequence was repeated until the monkey chose the reward target. When the chosen dot was the reward target, the dot turned red, all distractors turned green, and the animal received a 0.3-ml drop of water. The location of the reward target was then randomized, and the next trial began.

Task B was identical to task A except that the six dots included another target, referred to as the “informative” target. When the monkey selected the informative target as the first choice, all the distractors turned green while the reward target remained white, thus revealing the reward target. After selecting the informative target, the monkey merely had to fixate on the central cross and perform a saccade to the white dot to receive the reward. If the informative target was also the reward target, the monkey was rewarded when it first selected the target. In each trial block, the informative dot remained the same, whereas the reward target was randomized. Thus in task B, the monkey knew the location of the informative target but not that of the reward target. The expected amount of information was greater for the informative target than for the other dots, whereas the expected reward was the same for all the dots. In task A, the expected amount of information was the same for all the dots and was equal to that of the dots other than the informative target in task B.

In task C, the reward target was located at the same site as the informative target in task B. In this task, the first saccade did not provide the monkey with any information because the reward target was known before the first saccade.

First, the monkey performed three tasks: task A; task B, where the lower right dot was the informative target; and task C, where the lower right dot was the reward target. Each task was provided as a trial block, and it ended when the animal chose the lower right dot at the first saccade 20 times. The first fixation of these 20 trials shared an identical visual stimulus configuration and was followed by the same motor response in all the tasks. The order of the tasks was randomized for recording different units. Second, the monkey performed task B in which either the top dot or the lower left dot was informative. This task ended when it chose the informative target at the first saccade 20 times. In these 20 trials, the first cross fixation shared an identical stimulus configuration but was followed by different motor responses. Of the 20 trials performed in each task, the last 15 trials were used for unit recording to ensure that the monkey recognized the task that it was performing. For recording each unit, the monkey performed ∼60 trials of task A, 40 trials of task B, and 20 trials of task C. In the early trials of each task A block, it selected different dots as the first choice but then came to select the same dot first; it appeared to stop looking for the informative target and the reward target at the first saccade.

### Neurophysiological recording

The activity of single neurons was recorded from the right premotor cortex of *monkey A* and from the left premotor cortex of *monkey B* by using tungsten microelectrodes (FHC). The neurons were randomly selected; no attempt was made to search for a task-related activity. Waveform separation was performed off-line using a template-matching spike sorter (Spike2, CED). After the recording was completed, the animals were killed and then perfused with a fixative. During perfusion, 2 pins were inserted at a distance of 7 mm from each other at known coordinates to aid in the localization of the recording site. The recording sites were plotted on the basis of the positions of these pins.

### Information measure

The expected information during the first fixation was calculated as follows: Let *p*_{i} denote the probability that the *i*th dot is the reward target (*i* = 1,…, 6). In task A, *p*_{i} is 1/6 for any dot before the first choice is made. Therefore the information entropy is –log_{2}(1/6) (bits). If the chosen dot is a distractor, *p*_{i} becomes 0 for the chosen dot and 1/5 for the other dots, thereby decreasing the information entropy to –log_{2}(1/5). If the chosen dot is the reward target, *p*_{i} becomes 1 for the chosen dot and 0 for the other dots, and the information entropy becomes 0. Because the probability that the chosen dot is a distractor is 5/6, the expected information (that is, the expected decrement in the information entropy) is –log_{2}(1/6) –(5/6)[–log_{2}(1/5)], which is nearly 0.65 bits. The expected information is the same for all the dots. In task B, for all the dots other than the informative target, the expected information is the same as described in the preceding text. If the informative target is chosen, *p*_{i} becomes 1 for the reward target and 0 for the other dots because the reward target is revealed to the monkey. The information entropy then decreases to 0. Therefore the expected information is –log_{2}(1/6), which is nearly 2.58 bits. In task C, *p*_{i} is always 1 for the reward target and 0 for the other dots; therefore the expected information is 0.

### Alternative measures of information

I also tested the following two information measures (Nelson 2005): the expected improvement in the probability of correctly identifying the true target (probability gain) and the expected absolute change in beliefs about the location of the true target (impact).

Probability gain is defined by the following equation

Pg(*Q*) = Σ_{q} *P*(*q*) max_{h} *P*(*h*|*q*) – max_{h} *P*(*h*)  (1)

where Pg(*Q*), *q*, and *h* denote the probability gain that is obtained by question *Q*, the answers to *Q*, and the hypotheses to be tested, respectively; *P*(*q*), *P*(*h*|*q*), and *P*(*h*) are the probability of *q*, the conditional probability of *h* in the case of *q*, and the prior probability of *h*, respectively. With regard to the first choice in task A, *Q* was a saccade to the lower right dot that asked whether the dot was the reward target, *q* was yes or no, and *h* was that the *i*th dot was the reward target (*i* = 1,…, 6). *P*(*q*) is 1/6 for *q* = yes and 5/6 for *q* = no. *P*(*h*|yes) is 1 for the *h* that the lower right dot is the reward target and 0 for the other dots. *P*(*h*|no) is 0 for the *h* that the lower right dot is the reward target and 1/5 for the other dots. *P*(*h*) is 1/6 for any *h*. Consequently, Pg(*Q*) = (1/6)max[1, 0,…, 0] + (5/6)max[0, 1/5,…, 1/5] –max[1/6,…, 1/6] = 1/6. In task B, *P*(*q*), *P*(*h*|yes), and *P*(*h*) are the same as in task A. *P*(*h*|no) is 1 for the *h* that corresponds to the revealed reward target and 0 for the other dots. Pg(*Q*) = (1/6)max[1, 0,…, 0] + (5/6)max[1, 0,…, 0] –max[1/6,…, 1/6] = 5/6. In task C, *P*(*q*) is 1 for *q* = yes and 0 for *q* = no. *P*(*h*|yes) and *P*(*h*) are 1 for the *h* that the lower right dot is the reward target and 0 for the other dots. Pg(*Q*) = (1)max[1, 0,…, 0] –max[1, 0,…, 0] = 0.

Impact is defined by the following equation

Im(*Q*) = Σ_{q} *P*(*q*) (1/*n*) Σ_{h} abs[*P*(*h*|*q*) – *P*(*h*)]  (2)

where Im(*Q*) and abs denote the impact of *Q* and the absolute value, respectively, and *n* is the number of hypotheses *h*. From this, we obtain the values of impact for the first choice in task A as follows: Im(*Q*) = (1/6)(abs[1 –1/6] +5abs[0 –1/6])/6 + (5/6)(abs[0 –1/6] +5abs[1/5 –1/6])/6 = 5/54. In task B, Im(*Q*) = (1/6)(abs[1 –1/6] +5abs[0 –1/6])/6 + (5/6)(abs[1 –1/6] +5abs[0 –1/6])/6 = 5/18. In task C, Im(*Q*) = (1)(abs[1 –1] +5abs[0 –0])/6 = 0. All three information measures are the greatest in task B and the least in task C.
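The values of the three measures for the first choice in each task can be reproduced with a short script. This is an illustrative sketch rather than the code used in the study; the function names and the list-based encoding of the hypotheses (index 0 stands for the chosen lower right dot) are assumptions made here. For task B, the answer "no" is expanded into the five possible revealed locations, which by symmetry gives the same values as the yes/no formulation in the text.

```python
from fractions import Fraction as F
from math import log2

def entropy_bits(ps):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(float(p) * log2(float(p)) for p in ps if p > 0)

def expected_information(prior, outcomes):
    """Expected decrement in entropy over outcomes [(P(q), posterior), ...]."""
    return entropy_bits(prior) - sum(float(pq) * entropy_bits(post) for pq, post in outcomes)

def probability_gain(prior, outcomes):
    """Sum over q of P(q) * max_h P(h|q), minus max_h P(h)."""
    return sum(pq * max(post) for pq, post in outcomes) - max(prior)

def impact(prior, outcomes):
    """Sum over q of P(q) * mean over h of |P(h|q) - P(h)|."""
    n = len(prior)
    return sum(pq * sum(abs(a - b) for a, b in zip(post, prior)) / n for pq, post in outcomes)

def one_hot(i, n=6):
    """Belief that puts probability 1 on dot i."""
    return [F(1) if k == i else F(0) for k in range(n)]

uniform = [F(1, 6)] * 6

task_a = (uniform, [
    (F(1, 6), one_hot(0)),              # "yes": the chosen dot is the reward target
    (F(5, 6), [F(0)] + [F(1, 5)] * 5),  # "no": the five other dots stay equally likely
])
# Task B: choosing the informative target reveals the reward target, wherever it is.
task_b = (uniform, [(F(1, 6), one_hot(i)) for i in range(6)])
# Task C: the reward target is known in advance, so the choice changes nothing.
task_c = (one_hot(0), [(F(1), one_hot(0))])

results = {}
for name, (prior, outcomes) in {"A": task_a, "B": task_b, "C": task_c}.items():
    results[name] = (
        expected_information(prior, outcomes),
        probability_gain(prior, outcomes),
        impact(prior, outcomes),
    )
    info, pg, im = results[name]
    print(f"task {name}: information = {info:.2f} bits, Pg = {pg}, Im = {im}")
```

Exact fractions are used for probability gain and impact so that the results can be compared directly with the values 1/6, 5/6, 5/54, and 5/18 given in the text.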

### State value in reinforcement learning

In addition to the information measures, I tested the state value that is the net present value of the discounted future reward. The state value indicates the proximity of the present state to the reward in the reinforcement learning theory (Sutton and Barto 1998). It is given as *E*[Σ_{j} *d*^{j} *R*_{j}], where *R*_{j} (*j* = 0, 1, 2,…) denotes the reward obtained by the (*j*+1)th choice in each trial, and *d* is the discount rate (0 < *d* < 1) that indicates the subject’s concern regarding the future reward. In the present study, *R*_{j} was 0 or a drop of water depending on the monkeys’ selections. In task A, the probability that the monkey selects the reward target as the first choice is 1/6. Therefore *E*[*R*_{0}] = R/6, where R denotes a 0.3-ml drop of water. The probability that it selects the reward target as the second choice is a product of the probability that the first choice is a distractor, 5/6, and the probability that it selects the reward target from the remaining five white dots as the second choice, 1/5. Therefore *E*[*dR*_{1}] = *d*(5/6)(1/5)R = *d*(R/6). Similarly, *E*[*d*^{2}*R*_{2}] = *d*^{2}(R/6),…, *E*[*d*^{5}*R*_{5}] = *d*^{5}(R/6). From these, we obtain *E*[Σ_{j} *d*^{j} *R*_{j}] = (1 + *d* + *d*^{2} + *d*^{3} + *d*^{4} + *d*^{5})(R/6). In task B, the monkeys almost always chose the informative target as the first choice. Assume that this probability is 1. The probability that the informative target is the reward target is 1/6. Therefore *E*[*R*_{0}] = R/6. The probability that the monkey selects the reward target as the second choice is a product of the probability that the informative target is not the reward target, 5/6, and the probability that it selects the remaining white dot, 1. Therefore *E*[*dR*_{1}] = *d*(5/6)R. We obtain *E*[Σ_{j} *d*^{j} *R*_{j}] = (1 + 5*d*)(R/6). In task C, the monkeys almost always chose the reward target as the first choice. Assuming that this probability is 1, *E*[*R*_{0}] = R. We obtain *E*[Σ_{j} *d*^{j} *R*_{j}] = R. By normalizing the state values to the value R, we obtain (1 + *d* + *d*^{2} + *d*^{3} + *d*^{4} + *d*^{5})/6 for task A, (1 + 5*d*)/6 for task B, and 1 for task C. Because 0 < *d* < 1, the normalized state values are the greatest in task C and the least in task A.
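The ordering of the normalized state values can be verified for any discount rate with a few lines of code. The sketch below is illustrative only; the function name `normalized_state_values` and the sample values of *d* are assumptions made here, not values from the study.

```python
def normalized_state_values(d):
    """Normalized state values (in units of R) for tasks A, B, and C."""
    if not 0 < d < 1:
        raise ValueError("discount rate d must satisfy 0 < d < 1")
    # Task A: the reward target is found at the (j+1)th choice with probability 1/6.
    task_a = sum(d ** j for j in range(6)) / 6
    # Task B: rewarded at the first choice with probability 1/6, else at the second.
    task_b = (1 + 5 * d) / 6
    # Task C: rewarded at the first choice.
    task_c = 1.0
    return task_a, task_b, task_c

for d in (0.1, 0.5, 0.9):
    a, b, c = normalized_state_values(d)
    print(f"d = {d}: task A = {a:.3f}, task B = {b:.3f}, task C = {c:.3f}")
```

The ordering follows because each of *d*^{2},…, *d*^{5} is smaller than *d* when 0 < *d* < 1, so the task A sum is smaller than 1 + 5*d*, which in turn is smaller than 6.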

### Data analysis

The discharge rates of the neural activity recorded during the first cross fixation that was followed by saccades to the lower right dot were compared across the three tasks using the Mann-Whitney *U* test. In addition, in task B, the neural activity during the first cross fixation was compared between different locations of the informative target using the Mann-Whitney *U* test. The neural activity during the sixth cross fixation that was followed by saccades to any dot in task A was compared with the neural activity during the first cross fixation that was followed by saccades to the lower right dot in task C. Further, the former activity was compared with the neural activity during the first cross fixation that was followed by saccades to the lower right dot in task A. These comparisons were performed using the Mann-Whitney *U* test. Unless otherwise stated, *P* < 0.01 was considered to indicate statistical significance.

To evaluate information-related activity across the population, the mean response of neurons that correlated with the information measures was calculated. Before averaging, the response of each neuron was normalized to its response in task B for increasing responses and to its response in task C for decreasing responses because the information measures were the greatest in task B and the least in task C. It follows that the normalized increasing and decreasing responses were 1 for tasks B and C, respectively. I examined whether the population responses of information-measuring neurons were fitted by straight lines *Y* = α(*X* –*X*_{o}) +1, where *X* and *Y* were the value of the information measure and the normalized response, respectively, *X*_{o} was the value of the information measure in task B for increasing responses and in task C for decreasing responses, and α was the coefficient. The value of α was obtained by minimizing the mean square error of the data from the straight line. Using the *F* test, the null hypothesis that *Y* was independent of *X* was tested against the alternative that the straight line *Y* = α(*X* –*X*_{o}) +1 fitted the data for each of the three information measures. The normalized responses were also compared between the three tasks using the Mann-Whitney *U* test.
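Because the line is constrained to pass through (*X*_{o}, 1), minimizing the mean square error has a closed-form solution for the slope. The following sketch shows that calculation under stated assumptions: the function name `fit_anchored_slope` is introduced here, the response values are hypothetical numbers chosen for illustration and are not recorded data, and the *F* test is not reproduced.

```python
def fit_anchored_slope(xs, ys, x0):
    """Least-squares alpha for Y = alpha * (X - x0) + 1, a line through (x0, 1)."""
    num = sum((x - x0) * (y - 1.0) for x, y in zip(xs, ys))
    den = sum((x - x0) ** 2 for x in xs)
    if den == 0:
        raise ValueError("need at least one X value different from x0")
    return num / den

# Expected information (bits) for tasks A, B, and C, as given in the Methods.
INFO_BITS = {"A": 0.65, "B": 2.58, "C": 0.0}

# Hypothetical normalized increasing responses (normalized to task B).
tasks = ["A", "A", "B", "B", "C", "C"]
responses = [0.62, 0.70, 1.00, 1.00, 0.48, 0.55]

xs = [INFO_BITS[t] for t in tasks]
alpha = fit_anchored_slope(xs, responses, x0=INFO_BITS["B"])
print(f"alpha = {alpha:.3f}")
```

For decreasing responses the same function would be called with `x0=INFO_BITS["C"]`, since those responses are normalized to task C.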

I also examined the mean response of the neurons that correlated with the state values. Because the normalized state values are the greatest in task C and the least in task A, the response of each neuron was normalized to the response of task C for increasing responses and to that of task A for decreasing responses. I fitted a single straight line *Y* = β(*X* –1) +1 to both the increasing and decreasing responses of each monkey as described below, where *X, Y,* and β were the normalized state values, normalized responses, and coefficients, respectively. The mean square error of the data from the straight line was minimized over the pairs of values *d* and β. From this, we obtain a single value of *d* for each monkey; this indicates the monkey’s concern regarding the future reward. To fit a single line to both the increasing and decreasing responses, I regarded the decreasing responses as the increasing responses in a reverse order of the state values. In other words, I hypothetically regarded the decreasing responses at state values (1 + *d* + *d*^{2} + *d*^{3} + *d*^{4} + *d*^{5})/6, (1 + 5*d*)/6 and 1 as responses at state values 1, (1 + *d* + *d*^{2} + *d*^{3} + *d*^{4} + *d*^{5})/6 + (1 –(1 + 5*d*)/6), and (1 + *d* + *d*^{2} + *d*^{3} + *d*^{4} + *d*^{5})/6, respectively. It follows that the reversed decreasing responses were 1 at the normalized state value, 1. I then fitted a straight line to the normalized responses from all neurons that correlated with the state value of each monkey. Using the *F* test, the null hypothesis that *Y* was independent of *X* was tested against the alternative that the straight line *Y* = β(*X* –1) +1 fitted the data.
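The joint fit over *d* and β, including the reversed placement of the decreasing responses, can be sketched as a grid search over *d* with a closed-form β at each step. This is one possible implementation and an assumption on my part; the text does not specify the optimization procedure. The function names are hypothetical, and the demonstration data are synthetic points generated from known values (*d* = 0.6, β = 0.8) so that recovery can be checked, not recorded responses.

```python
def state_values(d):
    """Normalized state values for tasks A, B, and C at discount rate d."""
    v_a = sum(d ** j for j in range(6)) / 6
    v_b = (1 + 5 * d) / 6
    return {"A": v_a, "B": v_b, "C": 1.0}

def reversed_state_values(d):
    """X positions for decreasing responses: the task order is reversed so that
    task A (to which these responses are normalized) sits at state value 1."""
    v = state_values(d)
    return {"A": 1.0, "B": v["A"] + (1.0 - v["B"]), "C": v["A"]}

def fit_d_and_beta(increasing, decreasing, steps=999):
    """Fit Y = beta * (X - 1) + 1 to (task, response) pairs by grid search over d."""
    best = None
    for k in range(1, steps + 1):
        d = k / (steps + 1)
        inc_x, dec_x = state_values(d), reversed_state_values(d)
        points = [(inc_x[t], y) for t, y in increasing]
        points += [(dec_x[t], y) for t, y in decreasing]
        den = sum((x - 1.0) ** 2 for x, _ in points)
        if den == 0.0:
            continue
        # Closed-form least-squares slope for a line forced through (1, 1).
        beta = sum((x - 1.0) * (y - 1.0) for x, y in points) / den
        mse = sum((y - (beta * (x - 1.0) + 1.0)) ** 2 for x, y in points) / len(points)
        if best is None or mse < best[0]:
            best = (mse, d, beta)
    if best is None:
        raise ValueError("no usable data points")
    return best[1], best[2]

# Synthetic check: generate noiseless responses from known parameters.
TRUE_D, TRUE_BETA = 0.6, 0.8
inc = [(t, TRUE_BETA * (x - 1.0) + 1.0) for t, x in state_values(TRUE_D).items()]
dec = [(t, TRUE_BETA * (x - 1.0) + 1.0) for t, x in reversed_state_values(TRUE_D).items()]
d_hat, beta_hat = fit_d_and_beta(inc, dec)
print(f"recovered d = {d_hat:.3f}, beta = {beta_hat:.3f}")
```

A grid search is used here only because it is easy to verify; any one-dimensional minimizer over *d* would serve the same purpose.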

## RESULTS

### Behavioral performance

Figure 2*A* shows the success rate of trials in task A. Once they had made their first selection, both monkeys completed most of the trials and received a reward (92 and 76%). *Monkey A* completed the trials at nearly the same success rate, irrespective of the number of choices necessary to receive the reward. The success rate of *monkey B* decreased slightly with the number of choices.

Figure 2*B* shows the monkeys’ performance in terms of selecting the informative target as the first choice in task B. Both the monkeys almost always selected the informative target as the first choice, irrespective of the dot that was the informative target (96 and 89% of the trials). This indicates that they were seeking information regarding the reward target.

In task C, the monkeys almost always chose the reward target at the first saccade (99% of the trials in the case of both the monkeys).

### Neural correlates of information measure

I recorded the activity of 1,832 randomly selected neurons in the PMd of the two monkeys (Fig. 3) and analyzed the neural activity that was recorded during the first cross fixation. The neural activity was stored in the database if the fixation was followed by saccades to the informative target in task B or by saccades to the dot at the same position as the informative target in tasks A and C. Thus the activity shared an identical visual stimulus configuration and was followed by the same motor response. The amount of information expected from the subsequent eye movement was the greatest in task B and the least in task C. Forty-six percent of the neurons (844/1,832) showed significant differences in activity between some of the tasks, and 13% of these neurons (110/844) showed significant differences between all the tasks. Of these 110 neurons, 50% (55 neurons) showed activity that reflected the expected amount of information. Figure 4*A* shows representative data obtained from an information-measuring neuron, which exhibited the greatest activity during task B and the least activity during task C. This activity cannot be explained by sensory input or the preparation of motor response because these were identical for all the tasks. Thus the activity may reflect the amount of information that the monkeys expected to obtain from subsequent eye movements. Of the 55 neurons, 26 showed a significantly greater activity in task B than in task A and a significantly greater activity in task A than in task C. The remaining 29 neurons exhibited the opposite pattern: the least activity in task B and the greatest activity in task C.

Another possible explanation for the data was that the neural activity reflected the probability that the choice after the first one was the reward target, which was the greatest for task B and the least for task C. This explanation was rejected for the following reasons: the abovementioned probability was 0 both at the time of making the sixth choice in task A and the first choice in task C. If the explanation had been correct, the neural activity during the sixth cross fixation in task A should have been equal to that during the first cross fixation in task C and significantly different from that during the first cross fixation in task A. However, this was not the case for any information-measuring neuron (of the 55 neurons, 20 contradicted both the conditions and 35 contradicted only the latter condition). The neuron shown in Fig. 4*A* contradicted both the conditions: its activity during the sixth cross fixation in task A was significantly different from that during the first cross fixation in task C but was not different from that during the first cross fixation in task A. These findings support the possibility that the activity was able to reflect the expected amount of information.

To evaluate the information-related activity across the population, I normalized the activity of each information-measuring neuron and plotted its activity as a function of the information-theoretic measure (Fig. 4*C*). The *F* test suggests that the change in the population responses could be proportional to the information-theoretic measure [variance ratios were 16.6 (*P* < 0.01) and 53.0 (*P* < 0.01) for the increasing and decreasing responses of *monkey A* and 6.72 (*P* < 0.025) and 72.8 (*P* < 0.01) for the increasing and decreasing responses of *monkey B,* respectively] and such changes between tasks were significant. Such proportionality may reflect the intuition that the amount of information is additive between stochastically independent events.

The *F* test also suggests proportionality to the alternative measures of information: in the case of probability gain, the variance ratios were 8.67 (*P* < 0.01) and 52.9 (*P* < 0.01) for the increasing and decreasing responses for *monkey A* and 3.11 (not significant) and 29.8 (*P* < 0.01) for the increasing and decreasing responses for *monkey B,* respectively. In the case of impact, the variance ratios were 36.6 (*P* < 0.01) and 49.5 (*P* < 0.01) for the increasing and decreasing responses for *monkey A* and 16.2 (*P* < 0.01) and 79.2 (*P* < 0.01) for the increasing and decreasing responses for *monkey B,* respectively. These variance ratios suggest that all population responses were proportional to any of the three information measures except for the increasing response of *monkey B* that could have been uncorrelated with the probability gain. The variance ratios were smaller in the probability gain than in the other measures except for the decreasing response of *monkey A*. This suggests that the three information measures showed relatively similar proportionality to the neural responses although the probability gain was slightly inferior to the others.

The information-theoretic measure satisfies the following two conditions for the information measure: decreasing with probability and additive between independent events. The question that then arises is whether or not the alternative measures satisfy these conditions. It is shown that the probability gain decreases with probability but is not additive and the impact neither decreases with probability nor is additive (appendix).

These considerations suggest that, in comparison with the other measures, the information-theoretic measure is a more plausible measure of information in the brain, although the three information measures showed similar proportionality to the neural response.

### Motor control by information-measuring neurons and the time course of population response

To examine whether the information-measuring neurons participated in the motor selection process, I analyzed the neural activity with changes in the location of the informative target in task B. Approximately half of the information-measuring neurons (30/55 or 54%) exhibited significant changes in the activity during the first fixation. The neuron shown in Fig. 4*A* was more active when the lower right dot was informative than when the top dot was informative [tasks B and B(t)]. These activities shared the same visual stimulus configuration and information value expected from subsequent eye movements but differed in terms of subsequent eye movements. Therefore the activities may reflect the preparation for those eye movements. This suggests that the information-measuring neuron might also be involved in the selection of motor response based on expected information.

The time course of population responses showed initial differences between the tasks, followed by a gradual increase in the greater responses (Fig. 4, *D* and *E*). This suggests that the monkeys were aware of the task type at the beginning of the trials and defined their expected information value based on this awareness.

### Neurons sensitive to reward proximity

The 110 neurons that exhibited significant changes in activity between all the tasks included 43 neurons (39%) of another type, whose activity reflected proximity to the reward. Figure 4*B* illustrates an example of these reward-proximity-coding neurons, which were the least active in task A and the most active in task C. The expected number of eye movements required to receive the reward was the greatest in task A, whereas only one eye movement was required to receive the reward in task C. This suggests that these neurons may encode proximity to the reward. Of these 43 neurons, 33 were the most active in task C and the least active in task A. The remaining 10 neurons showed the opposite pattern, that is, they were the most active in task A and the least active in task C.

I fitted the linear function of the state value to the normalized activity of the reward-proximity-coding neurons. The values of *d* minimizing the mean square error were 0.742 for *monkey A* and 0.515 for *monkey B*. Because value *d* indicates the subject’s concern regarding the future reward, the obtained values suggest that *monkey A* was more concerned regarding the future reward than *monkey B.* This might cause *monkey A* to continue task A longer than *monkey B* (Fig. 2*A*). The variance ratios were 45.4 (*P* < 0.01) for *monkey A* and 177 (*P* < 0.01) for *monkey B*, suggesting that the state value well explains the responses of the reward-proximity-coding neurons.

### Distribution of task-related neurons

To address the question of whether the present data could be a result of random activity, note that there are six possible orderings of activity across the three tasks. The information-measuring neurons increased their activity in the order of tasks C, A, and B or B, A, and C. The reward-proximity-coding neurons increased their activity in the order of tasks A, B, and C or C, B, and A. The two remaining orderings correspond to the other neurons. Each neuron type therefore accounts for two of the six orderings, so if the data were a result of random activity, the information-measuring neurons, the reward-proximity-coding neurons, and the other neurons would have been present in equal proportions among the 110 neurons. The distribution of the three neuron types deviated widely from equal proportions (50, 39, and 11%, respectively), suggesting that the present data were not the result of random activity.
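The counting argument can be made explicit by enumerating the six orderings and comparing the counts expected under chance with the observed counts. The sketch below is an illustration added here, not an analysis from the study; the count of 12 for the other neurons is derived as 110 – 55 – 43, and no significance test is performed.

```python
from itertools import permutations

# Orderings of tasks from least to greatest activity, as described in the text.
INFORMATION = {("C", "A", "B"), ("B", "A", "C")}
REWARD_PROXIMITY = {("A", "B", "C"), ("C", "B", "A")}

def classify(order):
    """Assign an activity ordering to one of the three neuron types."""
    if order in INFORMATION:
        return "information-measuring"
    if order in REWARD_PROXIMITY:
        return "reward-proximity-coding"
    return "other"

orderings_per_type = {}
for order in permutations("ABC"):
    label = classify(order)
    orderings_per_type[label] = orderings_per_type.get(label, 0) + 1

n_neurons = 110
observed = {"information-measuring": 55, "reward-proximity-coding": 43}
observed["other"] = n_neurons - sum(observed.values())  # derived, not reported directly

# Under chance each ordering is equally likely, so each type expects 2/6 of the neurons.
expected = {label: n_neurons * k / 6 for label, k in orderings_per_type.items()}

for label in observed:
    print(f"{label}: observed {observed[label]}, expected under chance {expected[label]:.1f}")
```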

## DISCUSSION

The PMd receives neural connections from areas that encode information related to probability, such as the ventral midbrain and the lateral intraparietal area, as mentioned in the introduction. The PMd may be a part of the neural circuitry that computes the information value derived from the signals from those areas. Studies have shown that the PMd is involved in visually guided motor selection (di Pellegrino and Wise 1993; Hoshi and Tanji 2000; Rizzolatti et al. 1998). The PMd has also been shown to be involved in oculomotor control (Fujii et al. 2000). The present data suggest that the PMd may also participate in the selection of informative action from among visual targets.

The neural activity related to reward proximity has previously been observed in the anterior cingulate cortex (Shidara and Richmond 2002) and the caudate nucleus (Kawagoe et al. 1998). The present neural activity related to reward proximity indicates that the PMd also plays a role in the calculation of reward proximity.

The present study has shown that the brain may measure information using information entropy. The ability to measure information is essential to seek more information. The information-theoretic measure has been shown to serve human category learning (Corter and Gluck 1992) and the representation of human knowledge in computers (Nakamura et al. 1983; Quinlan 1983). A mechanism calculating the information entropy could contribute to the general information-seeking activity.

## APPENDIX

### Intuitive conditions for information measures

I examined whether the alternative measures satisfied the two conditions for the information measure: decreasing with probability and additive between independent events. For the previously mentioned example of identifying a playing card, the probability gain of the number on the card is obtained as follows: *Q* is the question of what number is on the card. *q* and *h* each range over 1,…, 13. *P*(*q*) is 1 for the *q* that is the number on the card and 0 otherwise. Given that *q*, *P*(*h*|*q*) is 1 for the *h* that is the number on the card and 0 otherwise. *P*(*h*) is 1/13 for any *h*. Consequently, Pg(*Q*) = (1)max[1, 0,…, 0] –max[1/13,…, 1/13] = 12/13. Similarly, the probability gains of the suit and the card name (for example, “heart seven”) are 3/4 and 51/52, respectively; 12/13 > 3/4 and 12/13 + 3/4 is not equal to 51/52. It follows that the probability gain decreases with probability but is not additive.

The impact of the number on the card is (1)(abs[1 –1/13] + (12)abs[0 –1/13])/13 = (2)(12)/13^{2} = 24/169. Similarly, the impacts of the suit and card name are 6/16 and 102/2,704, respectively; 24/169 < 6/16 and 24/169 + 6/16 is not equal to 102/2,704. It follows that impact neither decreases with probability nor is additive.
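The card values for the two alternative measures can be reproduced with exact fractions. The sketch below is illustrative; the function names are assumptions made here, and it covers only the case used in this appendix, in which the answer singles out one of *n* equally likely hypotheses with certainty.

```python
from fractions import Fraction as F

def probability_gain_certain(n):
    """Probability gain when the answer identifies one of n equally likely hypotheses."""
    return F(1) - F(1, n)

def impact_certain(n):
    """Impact in the same case: mean absolute change in belief over n hypotheses."""
    return (abs(1 - F(1, n)) + (n - 1) * abs(0 - F(1, n))) / n

# Number (13 hypotheses), suit (4 hypotheses), and card name (52 hypotheses).
pg = {name: probability_gain_certain(n) for name, n in [("number", 13), ("suit", 4), ("card", 52)]}
im = {name: impact_certain(n) for name, n in [("number", 13), ("suit", 4), ("card", 52)]}

# Probability gain: larger for the less probable clue, but not additive.
print("Pg:", pg, "| additive:", pg["number"] + pg["suit"] == pg["card"])
# Impact: smaller for the less probable clue, and not additive.
print("Im:", im, "| additive:", im["number"] + im["suit"] == im["card"])
```

`Fraction` reduces values automatically, so 6/16 and 102/2,704 are displayed as 3/8 and 51/1352.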

Decrement in variance satisfies the two conditions for information measure. However, variance is not available unless a statistical variable is provided. As seen in the example of card identification, humans estimate the amount of information even in the absence of a statistical variable. In the present tasks, no statistical variable was provided, suggesting that the brain did not use variance for information measure.

## GRANTS

This work was supported by the Ministry of Education, Culture, Sports, Science and Technology of Japan.

## Acknowledgments

The author is grateful to E. Miyashita for technical advice, S. Watanabe for advice on data analysis, H. Itoh for valuable comments on the manuscript, M. Komatsu for technical assistance, and K. Matsuda for providing the eye-tracking system.

## Footnotes

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “*advertisement*” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

- Copyright © 2006 by the American Physiological Society