
Neural Information Processing Systems

This section gives an overview of our open-source code. Together with this git repo, we include a 'tutorial colab', a Jupyter notebook that can be run in the browser without requiring any local installation. We view this open-source effort as a major contribution of our paper. In this section we present the testbed pseudocode, recalling the setup of Section 3.1; we then describe the other parameters we use in the Testbed, along with the benchmark agents of Section 3.3. The final step of the pseudocode computes likelihoods for n = 1, 2, . . .
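The likelihood-computation step can be sketched as accumulating log-likelihoods of the realized labels under an agent's predictive probabilities. This is a minimal illustrative sketch only; the function name and interface below are hypothetical and the repo's actual evaluation code may differ:

```python
import math

def sequence_log_likelihoods(realized_probs):
    """Cumulative log-likelihoods after n = 1, 2, ..., N observations.

    realized_probs[t] is the probability the agent assigned to the label
    that was actually realized at step t (a hypothetical interface).
    """
    totals, running = [], 0.0
    for p in realized_probs:
        running += math.log(p)  # log-likelihood of this single observation
        totals.append(running)  # cumulative value after n = t + 1 steps
    return totals
```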


Europeans Scramble In AI Race

International Business Times

Generative AI chatbots unveiled by US tech firms have captivated the world with their spectacular successes and failures in engaging in conversations. But European firms focusing more on business applications are confident they won't be left in the dust in the rapidly developing field, even as they redouble their efforts. "The launch of ChatGPT has changed everything. It has been a wake-up call for European firms," said Laurent Daudet at French startup LightOn. "But the battle for generative AI isn't over," he added.


The (ab)use of Open Source Code to Train Large Language Models

Al-Kaswan, Ali, Izadi, Maliheh

arXiv.org Artificial Intelligence

In recent years, Large Language Models (LLMs) have gained significant popularity due to their ability to generate human-like text and their potential applications in various fields, such as Software Engineering. LLMs for Code are commonly trained on large unsanitized corpora of source code scraped from the Internet. The content of these datasets is memorized and emitted by the models, often in a verbatim manner. In this work, we will discuss the security, privacy, and licensing implications of memorization. We argue why the use of copyleft code to train LLMs is a legal and ethical dilemma. Finally, we provide four actionable recommendations to address this issue.


This lawsuit against Microsoft could change the future of AI

#artificialintelligence

Artificial intelligence (AI) is suddenly the darling of the tech world, thanks to ChatGPT, an AI chatbot that can do things such as carry on conversations and write essays and articles with what some people believe is human-like skill. In its first five days, more than a million people signed up to try it. The New York Times hails its "brilliance and weirdness" and says it inspires both awe and fear. For all the glitz and hype surrounding ChatGPT, what it's doing now are essentially stunts -- a way to get as much attention as possible. The future of AI isn't in writing articles about Beyoncé in the style of Charles Dickens, or any of the other oddball things people use ChatGPT for. Instead, AI will be primarily a business tool, reaping billions of dollars for companies that use it for tasks like improving internet searches, writing software code, discovering and fixing inefficiencies in a company's business, and extracting useful, actionable information from massive amounts of data.


Are We Ready for AI-Generated Code?

#artificialintelligence

In recent months, we've marveled at the quality of computer-generated faces, cat pictures, videos, essays, and even art. Artificial intelligence (AI) and machine learning (ML) have also quietly slipped into software development, with tools like GitHub Copilot, Tabnine, Polycode, and others taking the logical next step of putting existing code autocomplete functionality on AI steroids. Unlike cat pics, though, the origin, quality, and security of application code can have wide-reaching implications -- and at least for security, research shows that the risk is real. Prior academic research has already shown that GitHub Copilot often generates code with security vulnerabilities. More recently, hands-on analysis from Invicti security engineer Kadir Arslan showed that insecure code suggestions are still the rule rather than the exception with Copilot.


GitHub - deepmind/alphafold: Open source code for AlphaFold.

#artificialintelligence

This package provides an implementation of the inference pipeline of AlphaFold v2.0. This is a completely new model that was entered in CASP14 and published in Nature. For simplicity, we refer to this model as AlphaFold throughout the rest of this document. Any publication that discloses findings arising from using this source code or the model parameters should cite the AlphaFold paper. Please also refer to the Supplementary Information for a detailed description of the method.


GitHub's Commercial AI Tool Was Built From Open Source Code

#artificialintelligence

Earlier this month, Armin Ronacher, a prominent open-source developer, was experimenting with a new code-generating tool from GitHub called Copilot when it began to produce a curiously familiar stretch of code. The lines, drawn from the source code of the 1999 video game Quake III, are infamous among programmers--a combo of little tricks that add up to some pretty basic math, imprecisely. The original Quake coders knew they were hacking. "What the fuck," one commented in the code beside an especially egregious shortcut. So it was strange for Ronacher to see such code generated by Copilot, an artificial intelligence tool that is marketed to generate code that is both novel and efficient.
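The Quake III snippet in question is the famous "fast inverse square root": reinterpret a 32-bit float's bits as an integer, subtract the shifted bits from a magic constant to get a rough guess at 1/sqrt(x), then polish with one Newton-Raphson step. The original is C; below is a rough Python re-creation of the trick (the function name is mine):

```python
import struct

def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) via the Quake III bit-level hack."""
    # Reinterpret the 32-bit float's bit pattern as an unsigned integer.
    i = struct.unpack('>I', struct.pack('>f', x))[0]
    # The infamous magic constant yields a surprisingly good first guess.
    i = 0x5F3759DF - (i >> 1)
    y = struct.unpack('>f', struct.pack('>I', i))[0]
    # One Newton-Raphson iteration refines the estimate.
    return y * (1.5 - 0.5 * x * y * y)
```

Even with a single refinement step the result is accurate to within roughly 0.2 percent, which is exactly the kind of "basic math, imprecisely" shortcut the article describes.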


Snyk raises $150 million at $1 billion valuation for AI that protects open source code

#artificialintelligence

Snyk, a cybersecurity platform that helps developers find vulnerabilities in their open source applications, has raised $150 million in a round of funding led by New York-based private equity firm Stripes, with participation from Salesforce Ventures, Coatue, Tiger Global, BoldStart, Trend Forward, and Amity. This takes Snyk's total funding to $250 million from backers including Alphabet's GV and Accel, following a $22 million series B round in 2018 and a $70 million follow-on round just a few months ago. A Snyk spokesperson said that the company is now worth more than $1 billion, at least double the $500 million it was valued at back in September. Founded in 2015, London-based Snyk targets developers -- rather than cybersecurity personnel -- to help them find and fix flaws in their source code, as well as their containers and Kubernetes applications. The developer connects Snyk to a code repository in the likes of GitHub, GitLab, or Bitbucket, and Snyk then scans for vulnerabilities (or license violations), providing a description of the problem, noting where the flaw lies in the code, issuing a severity rating, and even suggesting a fix.


The Top GitHub Repositories & Reddit Threads Every Data Scientist should know (June 2018) - Analytics Vidhya

#artificialintelligence

Half the year has flown by and that brings us to the June edition of our popular series – the top GitHub repositories and Reddit threads from last month. During the course of writing these articles, I have learned so much about machine learning, both from open source code and from invaluable discussions among the top data science brains in the world. What makes GitHub special is not just its code hosting and social collaboration features for data scientists. It has lowered the entry barrier into the open source world and has played a MASSIVE role in spreading knowledge and expanding the machine learning community. We saw some amazing open source code released in June.