Although the existence of correlated spiking between neurons in a population is well known, the role such correlations play in encoding stimuli is not. We address this question by constructing pattern-based encoding models that describe how time-varying stimulus drive modulates the expression probabilities of population-wide spike patterns. The challenge is that large populations may express an astronomical number of unique patterns, so fitting a separate encoding model to each individual pattern is not feasible. We avoid this combinatorial problem using a dimensionality-reduction approach based on regression trees. Using the insight that some patterns may, from the perspective of encoding, be statistically indistinguishable, the tree divisively clusters the observed patterns into groups whose member patterns possess similar encoding properties. These groups, corresponding to the leaves of the tree, are far fewer than the original patterns, and the tree itself constitutes a tractable encoding model for each pattern. Our formalism can detect extremely weak stimulus-driven pattern structure and is based on maximizing the data likelihood rather than on a priori assumptions about how patterns should be grouped. Most important, by comparing pattern encodings with independent neuron encodings, one can determine whether neurons in the population are driven independently or collectively. We demonstrate this method using multiple unit recordings from area 17 of anesthetized cat in response to a sinusoidal grating and show that pattern-based encodings are superior to those of independent neuron models. The agnostic nature of our clustering approach allows us to investigate encoding by the collective statistics that are actually present rather than those (such as pairwise) that might be presumed.
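The idea of likelihood-driven divisive clustering can be illustrated with a minimal sketch. It is not the fitting procedure of this work, whose details are not given here. The sketch assumes that the stimulus is summarized by a discrete phase bin, that candidate splits ask whether a single neuron spikes in the pattern, and that a group is split only when the log-likelihood gain exceeds a fixed threshold. The names `split_gain`, `grow_tree`, `min_gain`, and `max_depth`, and the synthetic data, are all hypothetical.

```python
import numpy as np

def split_gain(child_mask, phase, n_phase):
    """Log-likelihood gain (in nats) from giving two child groups their own
    phase dependence instead of sharing the parent group's (assumed model)."""
    n_parent = np.bincount(phase, minlength=n_phase).astype(float)
    gain = 0.0
    for mask in (child_mask, ~child_mask):
        n_child = np.bincount(phase[mask], minlength=n_phase).astype(float)
        if n_child.sum() == 0:
            return 0.0  # degenerate split: one child is empty
        nz = n_child > 0
        p_child = n_child[nz] / n_child.sum()
        p_parent = n_parent[nz] / n_parent.sum()
        gain += np.sum(n_child[nz] * np.log(p_child / p_parent))
    return gain

def grow_tree(patterns, phase, n_phase, min_gain=15.0, depth=0, max_depth=6):
    """Greedy divisive clustering of observed patterns.
    patterns: (T, N) binary array, one population pattern per time bin.
    phase:    (T,) integer stimulus-phase bin for each time bin."""
    best_gain, best_neuron = 0.0, None
    for i in range(patterns.shape[1]):
        g = split_gain(patterns[:, i] == 1, phase, n_phase)
        if g > best_gain:
            best_gain, best_neuron = g, i
    if best_neuron is None or best_gain < min_gain or depth >= max_depth:
        # Leaf: a group of patterns treated as sharing one encoding model
        return {"leaf": True,
                "phase_counts": np.bincount(phase, minlength=n_phase)}
    fires = patterns[:, best_neuron] == 1
    return {"leaf": False, "neuron": best_neuron, "gain": best_gain,
            "spike": grow_tree(patterns[fires], phase[fires], n_phase,
                               min_gain, depth + 1, max_depth),
            "silent": grow_tree(patterns[~fires], phase[~fires], n_phase,
                                min_gain, depth + 1, max_depth)}

def count_leaves(node):
    """Number of pattern groups (leaves) in the tree."""
    if node["leaf"]:
        return 1
    return count_leaves(node["spike"]) + count_leaves(node["silent"])

# Synthetic example (made-up data): neuron 0 is phase-tuned, neurons 1-2 are not.
rng = np.random.default_rng(0)
T, n_phase = 4000, 8
phase = rng.integers(0, n_phase, size=T)
p0 = 0.1 + 0.8 * (phase < 4)
patterns = np.column_stack([
    rng.random(T) < p0,
    rng.random(T) < 0.3,
    rng.random(T) < 0.3,
]).astype(int)

tree = grow_tree(patterns, phase, n_phase)
print("root splits on neuron", tree.get("neuron"), "| leaves:", count_leaves(tree))
```

In this sketch the gain is a log-likelihood-ratio statistic for whether child-group membership depends on the stimulus phase within the parent group. Patterns that differ only in untuned neurons therefore tend to stay in the same leaf. The fixed threshold `min_gain` is a stand-in for a proper model-selection criterion.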