I would like to briefly explore the question of whether our aesthetic experience is in response to mental representations per se or, rather, to the physical stimuli correlated with those mental representations (in which case, the stimuli are responsible—or are the external-to-body beginnings, to be a bit more precise—for both the aesthetic experience and the mental representation). Perhaps aesthetic experience isn’t possible without some cultural, historical, personal (e.g., nostalgia), or biological (e.g., when a guitar mimics human sobbing) association—in other words, without meaning. I’ll get to that in a moment. But for now I’ll think in terms of aesthetic experience without such associations. I’ll use music as a reference point.
The usual intuition is that our aesthetic responses to music are in response to the way music sounds; that is, are in response to the experience of music.[1]
An illustration will clarify what I’m getting at. When you strike the A4 key on a piano, two events take place. One event is a physical process: air molecules and other materials vibrate at 440 Hz (that is, 440 cycles per second), resulting in your neural machinery firing in sympathy with that frequency (I’ll use note to refer to a frequency in this sort of context; so the note A4 refers to the frequency 440 Hz in a musical or similar context); the other event is mental: you have an experience, the content of which is—or, perhaps better put, the identity of which is—the pitch A (notice that I use pitch to denote the mental event correlated with the note A; also notice that the pitch A is a term we use to refer to an experience possessing a certain quality: a quality present whether the correlated note’s source is an oboe, human voice, singing bird, and so on).
Our intuition is that our aesthetic experience results from the qualities residing in the mental events; that is, the qualities of those pitches and the structures they form—i.e., harmonies, melodies, and other musical mental events. In short: We have, it seems, an experience of beautiful music because the music-oriented mental events themselves are beautiful.
Intuitive, yes. But this strikes me as quite probably wrong. I have previously observed (in my piece “Music and Emotion,” which could likely use an update), and many have observed before me (see Arnold Schoenberg’s unabridged Theory of Harmony for one of many examples) that the physical properties of notes affect our experience of music. This may seem obvious, but it’s worth looking at more closely. For example, when you sing a C, what is produced is a stacked series of frequencies called the overtone (or harmonic) series: C, C, G, C, E, G… The frequencies weaken as you move up the stack, and begin to go slightly out of tune. The critical point to notice here is that the first several notes spell out a C major chord. This might help explain, then, why the C minor chord, which has the notes C-Eb-G, produces dissonance in one’s experience; i.e., given the conflict produced with the E overtone in the note C. This effect will differ depending on how high or low the notes in question are (for instance, musicians refer to this effect in the lower register of the piano—where frequencies are slower and the overtones are more pronounced—as muddy, particularly when the notes of the chord are played close together; you clean out the mud by spacing the notes further apart or by arpeggiating).
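The overtone series described above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: an equal-tempered reference of A4 = 440 Hz, C3 as the sung note, ideal harmonics at exact integer multiples of the fundamental, and hypothetical helper names (nearest_note, overtone_series). It does not model how the overtones weaken as you move up the stack; it only shows which notes they land nearest and how far out of tune they are relative to equal temperament.

```python
import math

A4_HZ = 440.0  # reference frequency for the note A4 (assumed tuning standard)
NOTE_NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

# C3 sits 21 equal-tempered semitones below A4 (an arbitrary choice of sung note).
C3_HZ = A4_HZ * 2 ** (-21 / 12)


def nearest_note(freq_hz):
    """Return (note name with octave, deviation in cents) for the
    equal-tempered note closest to freq_hz."""
    semitones_from_a4 = 12 * math.log2(freq_hz / A4_HZ)
    nearest = round(semitones_from_a4)
    cents_off = 100 * (semitones_from_a4 - nearest)
    midi = 69 + nearest  # MIDI convention: A4 is note number 69
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, cents_off


def overtone_series(fundamental_hz, count=6):
    """List (harmonic number, frequency, nearest note, cents off) for the
    first `count` ideal harmonics of the fundamental."""
    series = []
    for n in range(1, count + 1):
        freq = n * fundamental_hz
        name, cents = nearest_note(freq)
        series.append((n, freq, name, cents))
    return series


for n, freq, name, cents in overtone_series(C3_HZ):
    print(f"harmonic {n}: {freq:8.2f} Hz, nearest note {name} ({cents:+.1f} cents)")
```

Under these assumptions the first six harmonics land nearest C, C, G, C, E, G, and the fifth harmonic (the E) sits noticeably flat of the equal-tempered E, which is one way of reading "begin to go slightly out of tune."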
(NB: I use conflict not to suggest that these natural physical phenomena have any interest in one another, any more than a strong wind has interest in the trees it’s uprooting; rather, I’m using a metaphor that humans can easily understand to characterize what happens when two notes are physically quite similar, but not quite the same [i.e., what we call being “close together”]. When notes are close, the physical result is a kind of integral incompatibility [as with the wind and the tree], resulting in dissonance in our experience for reasons that I won’t explore here; instead I’ll just characterize this physical process—both among the stimuli themselves and within the physical brain as it responds to those stimuli—as conflict.)
When we hear a minor chord, we call it “dark” or “ominous” or “spooky” or “sad,” depending on context or on how dissonant that chord sounds. A more complicated minor chord is the C minor-major 7, with notes C-Eb-G-B, which introduces a striking new point of conflict: the B conflicts with the C. Notice also that the chord is further complicated by producing a major third (or minor sixth) between the B and the Eb (which is enharmonic with D#), thus imposing into the mix the ethereal-sounding Eb augmented chord: the stuff of horror films. Play with the notes here: http://www.apronus.com/music/flashpiano.htm.
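The interval relationships in that chord can be checked with a small Python sketch. The pitch-class numbering (C = 0, counting semitones upward) is the standard convention; the helper names interval_class and pairwise_intervals are hypothetical, and the sketch treats enharmonic spellings (Eb and D#) as the same pitch class.

```python
from itertools import combinations

# Pitch classes in semitones above C (Eb and D# share the value 3).
PITCH_CLASS = {"C": 0, "Eb": 3, "E": 4, "G": 7, "B": 11}


def interval_class(a, b):
    """Smallest distance in semitones between two note names,
    ignoring octave and direction (a value from 0 to 6)."""
    d = abs(PITCH_CLASS[a] - PITCH_CLASS[b]) % 12
    return min(d, 12 - d)


def pairwise_intervals(chord):
    """Map every pair of notes in the chord to its interval class."""
    return {(a, b): interval_class(a, b) for a, b in combinations(chord, 2)}


c_minor_major7 = ["C", "Eb", "G", "B"]
for pair, semitones in pairwise_intervals(c_minor_major7).items():
    print(pair, semitones)

# Eb, G, and B are each a major third (4 semitones) apart: an augmented triad.
upper_three = pairwise_intervals(["Eb", "G", "B"])
print(all(s == 4 for s in upper_three.values()))
```

The C and B come out one semitone apart (the closest, most "conflicting" spacing in the chord), while Eb, G, and B are all four semitones from one another.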
Here I’ll point out one more critical distinction. There is a difference between music sounding sad on the one hand, and making a listener actually feel sad on the other. (Note that I am still avoiding associations. Of course your deceased lover’s favorite song will make you feel something that’s not due to the music itself.) It’s easy to make music sound sad; pretty much anyone can learn the formulas to do this. Creating music that makes people feel sad, however, is a mystery that comes with no formulas. An inspired musician can produce a happy-sounding piece of music that makes listeners feel sad. The former (music merely sounding sad) is a kind of superficial or basic aesthetic experience (BAE); the latter (music making one feel sad) is a rich or deep one (DAE). The latter is rarer by far.
I assume that what I’m proposing is applicable to both BAE and DAE, but it may only be applicable to the former. I doubt it, though for now that’s as far as I’m willing to stake my claim, which is the following.
Our intuition is that a particular bit of music strikes us as being beautiful, or gives us the feeling of having experienced beauty, because that music sounds beautiful. (Likewise, replace beauty with any other aesthetically oriented adjective.) There’s a mistake here. The sound of the music just is the experience. In other words, a pitch is an experience, not some object in your head that you then listen to with your mind’s ear. As I noted above, the pitch is a mental event that correlates with a physical event (i.e., a frequency or note). You experience the note as a pitch and it stops there, rather than experiencing the note as a pitch and then experiencing the pitch as itself a distinct object of perception.
Put in yet other terms, it naturally seems to us that the pitches are in the world—are coming out of the piano and into our heads—so that we may then listen to those pitches and deem them beautiful or not; that, however, is unintelligible. Namely, the notion of a mind’s ear leads to the classic recursive homunculus problem: the mind’s ear would need to experience (by means of its own brain-mind) the pitch it’s listening to—which is to say the pitch per se now functions as a stimulus for the mind’s ear—as some sort of new mental event correlated with the pitch, which presents the same problem all over again. This makes no sense, because there is no mind’s ear inside your head to “hear” the pitch as such (i.e., as a mental event); you don’t need to be on board with epiphenomenalism or some such—i.e., to hold some view that holds pitches and other experienced mental content to have no causal influence on brain states—to agree that the pitch per se cannot function as a physical stimulus for some higher order auditory faculty buried deep in the brain (i.e., some “mind’s ear”).
Instead, I propose the following. What determines our BAE (at least; I’ll briefly consider DAE below) is our experience of the notes, but not as pitches; rather, we experience them roughly simultaneously in some interoceptive or even affective capacity, and then characterize that experience (e.g., sad or happy) depending on context. (This may find a more thorough explanation in cognitive science work being done on embodiment and prediction, but I won’t go that deep into this here.) That is, when some music is played, some physical events happen, including stuff in our nervous system. Two sorts of experiences emerge from these events. One is our response to the effects of the physical event on our body (which includes the brain): it feels good or bad, for example (putting it over-simply). Call this a bodily experience. The other is an auditory experience (e.g., pitches, chords, and other musical sounds). We then correlate the good or bad bodily experience with the auditory experience and in doing so mistakenly assign a kind of causal relation in which the musical sound causes the positive feeling. What really happened, though, is that the physical events that happened when the music was played are the source of both the bodily experience and the auditory experience.
(N.B.: I’m using the terms bodily and auditory to pick out the differing natures of two different experiences happening in a single body at roughly the same time. So, they are both technically bodily experiences. But they are also certainly distinguishable by kind.)
A great way to test this, though I don’t know if it’s possible, would be to turn off a subject’s auditory perception while allowing bodily experience to function freely. This doesn’t mean closing the ears. Everything would have to happen that always happens in the brain when the subject listens to music, minus the perception. A hardcore materialist need not think this impossible given the phenomenon of blindsight.
In summary once more: I’m proposing that the pleasant or unpleasant (broadly speaking) experience of music amounts to a bodily response to the physical events associated with music creation—usually a subtler version of something like the pain from smashing one’s finger with a hammer; but there’s this other set of mental events that comes along with that bodily response: the pitches, chords, and so on. We mistakenly concoct a story in which our bodily response is to those pitches, but in reality the bodily response (or bodily experience) and the pitches (i.e., auditory experience) are both independently in response to the physical events associated with musical creation.
I am particularly willing to stake this claim in BAE territory. DAE is more complex, but I will only address it briefly. For example, memory, though seemingly required to have any sort of intelligible experience at all, likely plays a much bigger role in DAE. One of those roles, I suppose, is to facilitate a song’s growing on me. In that case, it may be that it’s not that the physical events of creating music are producing a new response (i.e., now that a song has grown on me), so much as there has developed a kind of meta-cognitive awareness of how the mental events resulting from that song relate to one another. By meta-cognitive, I mean my observation of the mental events (the pitches, chords, etc.) going on in my mind. So, I experience myself having that particular set of experiences related to that music (i.e., I experience myself experiencing), and then that triggers a new response (i.e., I like the song, whereas before I was neutral or maybe even didn’t like it). Put more simply: In the case of a song growing on me, I have a richer experience of the song because, for example, I experience the pitches happening early in the song in the context of what I know will happen later in the song; in other words, I’m experiencing the whole song rather than just disparate parts.
This account might seem to threaten a severe restriction on my basic thesis. I don’t think it does. There would still exist the BAEs in response to the disparate parts that make up the song; even if these are subtle, they would be significant. Furthermore, the thesis might well account for DAE to some significant degree after all. This would be so if, for example, it turns out that, thanks to memory, one is able to experience a larger musical event in the same way one experiences smaller musical events. To elaborate: There are pitches, timbres, melodies, harmonies, rhythms, and so on. A short melody with a few notes is experienced not only as those things, but, more importantly, as a kind of self-contained event—i.e., as a set of relations (e.g., pitch intervals)—both bodily and in terms of auditory experience. But a piece made up of several long melodies, each consisting of several notes and several key changes and so on, will be harder to experience as a single musical event—i.e., as a set of ordered relations whose beginning, middle, and end cohere into a single event—until it has been heard several times.[2] Finally, perhaps this goes yet deeper, and there is a story to tell about how, when a song grows on me, I might also be remembering how the physical events make me feel bodily, and it is the piecing together of the relations of those bodily events that accounts for the new experience of the song, rather than the relations between the auditory events. To think otherwise would again be to mistakenly attribute my overall experience of the song to the way the song sounds.
Perhaps this story works. But at some point, either beyond wherever we have crossed over into DAE or precisely at that crossover point, our experience of a song will be enriched or deepened by entirely non-musical associations. This too may of course involve memory (which may be unconscious; conditioning, for instance, certainly doesn’t require that one remembers the conditioning process!), and in general will come down to some meta-cognitive process’s involvement. Again, I use the term meta-cognitive fairly loosely: i.e., to refer to something like experiencing oneself having an experience.
These non-musical associations may be cultural (we recognize a certain style of music, as a basic example) or personal (that style of music was my mother’s favorite) or they may be more mysterious than this, as a result of a lifelong series of accumulating experiences that can’t be summarized with words, but can be tapped into with music. The difference between my physical makeup (which, for whatever reason, responds pleasantly to dissonance, like a deep itch being scratched in my mind-brain) and accumulated experiences can go a long way, I take it, in explaining why we can agree that a series of minor chords sounds sad, i.e., produces a BAE of a certain sort, but disagree about whether it provokes a DAE of a certain sort (or at all).
As I said, I won’t push my thesis too far into DAE territory right now, but I will say that it strikes me as quite plausible that all of DAE can also be claimed under the thesis (e.g., the plasticity of the brain and broader bodily development as a result of effects of environment [including diet, social relations, climate, the sorts of aesthetically oriented stimuli experienced (e.g., differing tuning systems) and on and on] could account for differences between individual DAE sensitivities, so that the bodily responses correlated with certain auditory experiences will take on a special flavor for a given individual, and can even vary for a given individual at different times).
Until I encounter a better story, I’ll take it that BAE is accounted for by the physical stuff that happens when music is being played, and DAE, at its deepest, is a result of non-musical associations. Both BAE and DAE may be mistakenly attributed to the experience of music as such, except in cases when we absolutely know that a piece of music has taken on an extra-musical dimension (e.g., “that’s my ex-lover’s favorite song”).
I’m going to summarize in yet another way, in case I haven’t been clear enough so far:
Our aesthetic experience is always EITHER a bodily response to the physical stuff that happens when people do musical things like expelling air through their vocal cords, striking piano keys, and so on, OR to the meaning (e.g., cultural, personal) we (usually unconsciously) assign to the mental events correlated with those kinds of music-making activities, RATHER THAN ever being to, as we mistakenly suppose, those musically oriented mental events in themselves as things of beauty, ugliness, and such.
A crude example might be a note played extremely loudly. (Crude, given that the pain it involves takes us beyond an aesthetic version of pain [e.g., an aesthetic response to the sound of a guitar imitating sobbing] into that of the real [e.g., an empathetic response to actual sobbing]; this crudeness makes the properties in question easy to distinguish, however.) Any physical damage—e.g., to hearing—caused here is not due to any mental representation, but to the physical amplitude of the note. Notice that amplitude itself has a mental representation distinct from that of the pitch (a pitch can sound quiet or loud, and can even be perceived as loud when the volume of a speaker is turned very low or when heard from a distance), but damage is clearly not caused by how loud it sounds. If you hallucinate an extremely loud-sounding pitch, I presume this won’t cause hearing damage (though I can imagine someone mis-characterizing as hallucination loud sounds that may accompany damage to one’s hearing, such as with tinnitus). Notice also that this suggests questions about what happens physiologically when we hallucinate a beautiful melody, as well as whether the hallucination could be correlated with a robust aesthetic experience. If such a hallucination is possible, my claims would predict a robust degree of embodied simulation—that is, the same sorts of changes and effects in the nervous system that one would expect to see when the melody is being heard rather than hallucinated; we would also expect to see a milder form of this when actively imagining such a melody (in which case the aesthetic effects would also be milder, if present at all: which is why most composers would prefer to hear their music performed rather than merely imagine it, no matter how vivid their imagination).
At this point, the story threatens to become quite complicated. The pitch happens at precisely the same time as some brain state (some might say the pitch and the physical brain state are identical). This brain state should be present with the hallucination; this accounts for what I above refer to as auditory experience. Any accompanying aesthetic experience will also need to be accounted for with some brain state—what I above refer to as a bodily experience. If hallucination is accompanied by the aesthetic experience, this could mean that the same brain state is correlated with (i.e., is the physical substrate for) both the bodily and the auditory experience, or that, somehow, the bodily-experience brain state follows from the auditory-experience brain state. If hallucination is not accompanied by aesthetic experience, it would then seem that the bodily experience is directly grounded in the external physical stimuli. Perhaps inducing hallucination (or even experimenting with lucid dreaming) could provide some answers, but I fear this will involve a lot of speculation. For example, if it’s the hallucination of a melody one has heard many times, an aesthetic response could be conditioned and thus follow from the hallucination. And if the melody is unfamiliar to the subject and there is no aesthetic response, it could be because the music simply failed to inspire the subject’s tastes. I think lucid dreaming would pose even more problems, as dreams can be emotional places. Perhaps the best bet would be an expert meditator who can turn off certain parts of the brain.
I’ll leave this line of thinking with the following point. It may seem unlikely that hallucinated music would fail to produce BAE: a minor chord should sound dark whether heard or hallucinated. But I’m not so sure. A testament to the importance of actual physical stimuli is in the difficulties one encounters with hearing the dissonance between two pitches when they are isolated in left and right ears. For example, it is difficult to tune two differing pitches, so that they perfectly match, when one pitch is only heard by the left ear and the other only by the right ear. The physical conflict doesn’t emerge. I’ve encountered this problem when mixing music, and it is, I suspect, part of why the conventional wisdom among mixing engineers is that you get better results with mixing through speakers rather than through earphones.
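The physical side of this tuning problem can be sketched in Python. The sketch rests on assumptions of my own: two pure sine tones at 440 Hz and 443 Hz (a hypothetical slightly mistuned pair), a sample rate of 8000 Hz, and hypothetical helper names (tone, windowed_peaks). It models only what happens to the signals themselves, namely that two tones summed in the same air (as with speakers) produce a slow swelling and fading at the difference frequency, while each tone kept in its own channel (as with earphones) has a steady level. It says nothing about how the brain combines what the two ears receive.

```python
import math

RATE = 8000  # samples per second (assumed; ample for tones near 440 Hz)


def tone(freq_hz, duration_s, rate=RATE):
    """Samples of a pure sine tone with peak amplitude 1."""
    n = int(duration_s * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def windowed_peaks(samples, window):
    """Peak absolute amplitude in each consecutive window of samples:
    a crude picture of how the loudness envelope moves over time."""
    return [
        max(abs(s) for s in samples[i:i + window])
        for i in range(0, len(samples) - window + 1, window)
    ]


f1, f2 = 440.0, 443.0  # hypothetical pair, 3 Hz apart
left = tone(f1, 1.0)   # one tone isolated in the left channel
right = tone(f2, 1.0)  # the other isolated in the right channel
mixed = [(l + r) / 2 for l, r in zip(left, right)]  # both tones in the same air

beat_hz = abs(f1 - f2)  # rate at which the mixed signal swells and fades

window = 200  # 25 ms per window
print("beat frequency:", beat_hz, "Hz")
print("left channel, lowest windowed peak: ", round(min(windowed_peaks(left, window)), 3))
print("mixed signal, lowest windowed peak:", round(min(windowed_peaks(mixed, window)), 3))
```

In the mixed signal the level periodically drops close to zero, which is the audible pulsing one listens for when tuning; in either isolated channel the level never dips, so that physical cue is absent from the signal reaching each ear.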
I would also suppose aesthetic experience to be just one sort of response one may have in this context. A variety of responses may be had to smells; textures; nonmusical sounds; colors and shapes that come together to form photographs, paintings or three-dimensional objects (e.g., people, our own bodies). Music struck me as a particularly easy point of exploration.
Finally, I think these questions are interesting given their implications for making sense of what we mean when talking about qualia. In his 2014 book Intuition Pumps and Other Tools for Thinking, Daniel Dennett presents the made-up case (that inspired this post) of Mr. Clapgras (a play on Capgras delusion, a disorder in which someone believes a loved one has been replaced by a duplicate impostor). Clapgras woke up one morning with no outward changes in his color perception—he can correctly identify colors just as he’s always been able to; but his emotional responses to colors have been inverted:
Color combinations he used to rate as pleasing he now rates as jarring, while finding the combinations of their “opposites” pleasing, and so forth. The shade of shocking pink that used to set his pulse racing he still identifies as shocking pink (though he marvels that anybody could call that shade of pink shocking), but it is now calming to him, while its complement, a shade of lime green that used to be calming, now excites him. (pp. 307–308, Kindle Edition)
Dennett then asks whether Clapgras’ qualia have been inverted. I won’t attempt to answer this question here, nor to propose a universal definition of quale (I’m not sure we need one), nor to do justice to Dennett’s exploration of his question. Instead, I’ll conclude by citing three more passages that emphasize the relevance to my thoughts here, which I think are the beginnings of thinking about how to approach Dennett’s concerns, whatever terminology we attach to things (a single term such as quale may prove insufficient; but this strikes me as a semantic rather than substantive representation of the concerns at hand). Namely, it doesn’t seem problematic to me that we have something along the lines of interoceptive or affective responses (or representations)—or some combination thereof that we may in their totality characterize as aesthetic—to stimuli at the same time as we have mental representations of those stimuli (e.g., actual percepts, which is to say the images and sounds we are given by our visual and auditory faculties, distinct from how those things make us feel; for sound, these are the most basic instances of what I refer to above as auditory experiences). We (unconsciously) notice a correlation between these two sorts of representation, but perhaps the tightness of that correlation leads us to incorrectly infer a causal relation. But it’s not the percepts that cause the representations of the aesthetic kind, but rather a third thing: the stimuli themselves.
Here are three passages from Dennett (I recommend the entirety of the relevant chapters, of course, to get a better sense of what he’s getting at):
What I want to know is simply how philosophers mean to use the word “qualia”—do they identify all changes in subjective response as changes in qualia, or is there some privileged subset of responses that in effect anchor the qualia? Is the idea of changing one’s aesthetic opinion about—or response to—a particular quale nonsense or not? Until one makes decisions about such questions of definition, the term is not just vague or fuzzy; it is hopelessly ambiguous, equivocating between two (or more) fundamentally different ideas. (p. 310)
Consider this: when we looked at poor Mr. Clapgras, we saw that something was seriously amiss with him, but there seemed to be two importantly different ways of putting his plight: A. His aesthetic and emotional reactions to his color qualia had all been inverted (while his qualia remained constant). B. His color qualia had been inverted, even though his competence to distinguish, identify, and name colors had been preserved. (p. 319)
Dennett then summarizes two standard lines of argument that show the failure of A and B, after which he notes that a better line of reasoning is that “the qualia discussed in A and B aren’t doing any work”:
In both A and B, we see that the discrimination machinery [Note: this is the ability to name colors, which a computer could do] is working just as before, while Clapgras’s reactions to the deliverances of that machinery are inverted. The qualia are interposed as a sort of hard-to-pin-down intermediary that is imagined to provide the basis or raw material or ground of the emotional reactions, and there seem then to be two places where the inversion could happen: before the qualia are “presented” to the appreciation machinery, or after “presentation,” in the way the appreciation machinery responds to those presented qualia. This is one presentation process too many. We know, for instance, that negative (alarming, fear-inducing) reactions can be triggered quite early in the perceptual process, and they then “color” all subsequent processing of that perceptual input, in which case we could say that the emotional reactions cause the qualia to have the subjective character they had for Clapgras, rather than (vice versa) that the “intrinsic” nature of the qualia cause or ground the emotional reactions. But if we’ve already arrived at the emotional (or aesthetic or affective) reactions to the perceptual input, we have no more “work” for the qualia to do, and, of course, a zimbo [Note: a zimbo is Dennett’s version of a philosophical zombie.] would be just as bummed out by inverted reactions-to-perceptions as a conscious person is. (pp. 319–320)

Footnotes:
- I use terms like aesthetic response and aesthetic experience interchangeably. Any instance of feeling (including conscious thinking) counts as an experience, which we may also call a mental representation. There are different sorts of experience, of course—aesthetic, affective (or emotional), visual (e.g., of the color red), and so on. I’ll try to be clear about what sorts of experiences I’m referring to as I go along. I trust that these are the sorts of experiences had by anyone reading this, and thus nothing mysterious. Though at the same time I’m all too familiar with how difficult it is to get across what is meant in the first place by terms like experience, representation, mental event, phenomenology.
- Perhaps some music can never be experienced in this way. A piece in which a random note is sounded for two seconds once a week would probably not ever be experienced as a single event. An aleatoric piece (i.e., one whose notes are chosen randomly, for example by rolling dice) may have no internal, “musical” logic to ground it, but it does have an extra-musical logic that grounds it. For example, John Cage’s 1951 “Music of Changes” for piano, composed by Cage using the I Ching to make musical decisions: https://www.youtube.com/watch?v=B_8-B2rNw7s. The perceived randomness and the constancy of the piano and the temporal framing of the stream of sounds etc. make it recognizable as a single piece of music and, I dare say, as a single event; teasing at and testing the integrity etc. of those sorts of boundaries—what are the limits of what we can experience as a single, meaningful musical event?—is part of what this music’s about; but there is also the accumulated effect on one’s aesthetic organs, as it were, of this and other music we might call auto-contextual. To put it simply, there are some days when this music scratches an itch in my brain-mind that otherwise cannot be reached. This is distinct from whatever intellectual pleasures the stories surrounding such music may bring (or fail to bring; in the case of John Cage, his motivations are usually more interesting and inspiring to me than the music they result in: I’ve thoroughly enjoyed every interview with him I’ve ever read). At any rate, Cage’s music doesn’t grow on you, or at least doesn’t seem to be designed to. 4’33”, for example, is most effective if you don’t have a clue as to what’s about to (not) happen (which is why you can’t really capture with still visual art the anticipation and suspense 4’33” must have inspired for unsuspecting audiences: you don’t stare at a blank canvas waiting for paint to appear)—and I recall Cage himself saying not long after its debut that he was finished with the silent piece, though I don’t recall where I read it.