Applying educational data mining and learning analytics to data collected from games and simulations for learning is an emergent practice. The CREATE Lab and the Games for Learning Institute have identified three types of data sets of interest to researchers: intensive lab data sets, extensive data sets, and intensive field data sets. Intensive lab data sets, collected within the research lab setting, are deep yet narrow: they provide a high level of detail on a limited number of participants and yield insights into the interactions between variables and their impact on outcomes. Variables can include learner engagement, learning and problem-solving strategies, experienced emotions, stress, cognitive load, goal orientation, cognitive abilities, and learning outcomes. Extensive data sets are broad yet shallow: they provide a limited amount of data on a large number of participants and yield insights into the impact of specific design features on learning processes and outcomes. Intensive field data sets are deep yet narrow, heterogeneous data sets with rich contextual data on a limited number of participants; they yield insights into real-life use of simulations and games and the impact of the learning context on their effectiveness. Most importantly, when these different types of data sets are collected using the same simulation or game and the same population, they can be used to approach constructs in multiple ways, allowing for enhanced analysis and cross-validation. While digital environments make it possible to capture these data, this community seeks closer collaboration with data scientists to develop stronger analyses and inferences.