performance computing
Thus Spake Long-Context Large Language Model
Liu, Xiaoran, Li, Ruixiao, Huang, Mianqiu, Liu, Zhigeng, Song, Yuerong, Guo, Qipeng, He, Siyang, Wang, Qiqi, Li, Linlin, Liu, Qun, Zhou, Yaqian, Huang, Xuanjing, Qiu, Xipeng
Long context is an important topic in Natural Language Processing (NLP), running through the development of NLP architectures, and offers immense opportunities for Large Language Models (LLMs), giving LLMs a lifelong learning potential akin to that of humans. Unfortunately, the pursuit of a long context is accompanied by numerous obstacles. Nevertheless, long context remains a core competitive advantage for LLMs. In the past two years, the context length of LLMs has achieved a breakthrough extension to millions of tokens. Moreover, the research on long-context LLMs has expanded from length extrapolation to a comprehensive focus on architecture, infrastructure, training, and evaluation technologies. Inspired by the symphonic poem Thus Spake Zarathustra, we draw an analogy between the journey of extending the context of LLMs and the attempts of humans to transcend their mortality. In this survey, we illustrate how LLMs struggle between the tremendous need for a longer context and the equal need to accept the fact that it is ultimately finite. To achieve this, we give a global picture of the lifecycle of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation, showcasing the full spectrum of long-context technologies. At the end of this survey, we present 10 unanswered questions currently faced by long-context LLMs. We hope this survey can serve as a systematic introduction to the research on long-context LLMs.
Data Race Detection Using Large Language Models
Chen, Le, Ding, Xianzhong, Emani, Murali, Vanderbruggen, Tristan, Lin, Pei-hung, Liao, Chuanhua
Large language models (LLMs) are demonstrating significant promise as an alternative strategy to facilitate analyses and optimizations of high-performance computing programs, circumventing the need for resource-intensive manual tool creation. In this paper, we explore a novel LLM-based data race detection approach combining prompt engineering and fine-tuning techniques. We create a dedicated dataset named DRB-ML, which is derived from DataRaceBench, with fine-grained labels showing the presence of data race pairs and their associated variables, line numbers, and read/write information. DRB-ML is then used to evaluate representative LLMs and fine-tune open-source ones. Our experiment shows that LLMs can be a viable approach to data race detection. However, they still cannot compete with traditional data race detection tools when we need detailed information about variable pairs causing data races.
Knowledge Graphs 2.0: High Performance Computing Emerges - insideBIGDATA
They're the most effective means of preparing data for statistical AI; creditable knowledge graph platforms utilize supervised and unsupervised learning to accelerate numerous processes; and their smart inferences are a form of machine intelligence. Coupling knowledge graphs with high performance computing enables organizations not only to avail themselves of sophisticated techniques to optimize AI, but also to employ it at the scale and speed of contemporary data demands. According to Katana Graph CEO Keshav Pingali, "There is a need for high performance graph computing… in two ways. One is the volume of data, and the other is time to insight." Scaling knowledge graphs with high performance computing is a means of rapidly analyzing the tremendous data quantities organizations routinely contend with, enabling informed, low-latency action across numerous use cases including "intrusion detection, fraud detection, and Anti-Money Laundering," Pingali noted.
High Performance Computing in the World of Artificial Intelligence - insideHPC
Thierry Pellegrino is vice president and general manager of Dell EMC High Performance Computing. In this special guest feature, Thierry Pellegrino from Dell EMC writes that data analytics powered by HPC & AI solutions are delivering new insights for research and the enterprise. For years, companies have been collecting data, but now, with the help of artificial intelligence (AI) and powerful analytics capabilities, they have the opportunity to get more out of it. However, with this opportunity, AI and analytics also have become big-data challenges that are changing how organizations and industries handle their data. Many organizations are turning to high-performance computing as the solution, resulting in a wave of new ways to leverage HPC, new skill set requirements and new approaches for an era defined by volumes of data.
Top 5 Deep Learning and AI Stories- June 1, 2018
1. Fusing high performance computing and AI
2. Find your next binge-worthy show with AI
3. The connection between self-driving vehicles and radiology
4. Robots are learning new tasks by mimicking humans
5. How AI could spot a silent cancer in time to save lives

1. FUSING HIGH PERFORMANCE COMPUTING AND AI
During GTC Taiwan 2018, NVIDIA CEO Jensen Huang announced HGX-2: a "building block" cloud-server platform that will let server manufacturers create more powerful systems around NVIDIA GPUs for high performance computing and AI. TechCrunch's Ron Miller sums it up best, saying that: "It's the stuff that geek dreams are made of."

2. FIND YOUR NEXT BINGE-WORTHY SHOW WITH AI
While AI may play a leading role in the entertainment industry's depictions of the future on screen, it's already starring in entertainment behind the scenes, thanks to Netflix. Our latest AI Podcast features the company's research and engineering director, Justin Basilico.

3. CONNECTING SELF-DRIVING VEHICLES AND RADIOLOGY
According to new commentary published in the Journal of the American College of Radiology, AI implementation may not be as far off as people believe, as seen in self-driving vehicles. "It is important to realize that many of these features are not far-future applications in radiology but will be incorporated into routine clinical practice over the next few years." "In radiology, having AI perform the mundane tasks that humans may struggle with or find interminable…frees us to interact with the images in ways that can push the boundaries of diagnostic science.
Some Data Scientist New Year Resolutions for 2017
I've never been very big on New Year's resolutions. I've tried them in the past, and while they are nice to think about, they are always overly vague, difficult to accomplish in a year, trite, or just don't get done (or attempted). This year I decided to try something different instead of just not making resolutions at all. I set out some professional goals for myself as a Data Scientist. Open source software is only as good as its community and/or developer(s).