Galois Tech Talk: Requirements and Performance of Data Intensive, Irregular Applications
Requirements and Performance of Data Intensive, Irregular Applications
Dr. John Feo
Many fundamental science, national security, and business applications need to process large volumes of irregular, unstructured data. Data collection and analysis is rapidly changing the way the scientific, national security, and economic communities operate. There are worldwide operational deployments of instruments to detect the proliferation of weapons of mass destruction, monitor terrorist cells, and track the movement of illicit goods and services. In the next 15 years 30% of battle-space defense forces will be autonomous with each advanced robotic device carrying dozens of sophisticated sensors collecting, processing, analyzing and transmitting large amounts of data. American economic competitiveness will depend increasingly on the timely analysis of many Petabytes of data collected in diverse computing clouds charting the social and economic behavior of consumers.
Unlike traditional scientific applications based on linear algebra routines, data analytic applications comprise large, integer-based graph computations with irregular data access patterns, low computation to memory access ratios, and high levels of fine grain parallelism that pass data and synchronize frequently. Traditional architectures optimized to run large-scale floating point intensive simulations are inadequate, and more suitable high-end architectures such as the Cray XMT are needed. In this talk I will discuss the programming language, tools, and system requirements for data analytic applications. I will survey the research at PNNL’s Center for Adaptive Supercomputer Software as regards graph analytics. In particular, I will present several key graph algorithms we have developed with an emphasis on structure, use of special hardware features, performance, and scalability.