Researchers at an Israeli security firm revealed Tuesday how hackers can turn the “hallucinations” of generative AI into a nightmare for an organization’s software supply chain.
In a blog post on the Vulcan Cyber website, researchers Bar Lanyado, Ortel Keizman, and Yair Divinsky explained how false information generated by ChatGPT about open-source software packages could be exploited to slip malicious code into a development environment.
They explained that they have seen ChatGPT generate URLs, references, and even code libraries and functions that do not actually exist.
If ChatGPT is fabricating code libraries or packages, attackers could use these hallucinations to spread malicious packages without resorting to suspicious and more easily detected techniques such as typosquatting or masquerading, they noted.
If an attacker can create a package to replace the “fake” packages recommended by ChatGPT, the researchers continued, they might be able to get a victim to download and use it.
That scenario is becoming increasingly likely, they maintained, as more and more developers migrate from traditional online sources for code solutions, such as Stack Overflow, to AI solutions like ChatGPT.
Already Generating Malicious Packages
“What the authors predict is that as generative AI becomes more popular, it will begin to receive the developer questions that once went to Stack Overflow,” said Daniel Kennedy, research director of information security and networking at 451 Research, part of S&P Global Market Intelligence, a global market research company.
“The answers to those questions generated by the AI may not be correct or may refer to packages that no longer exist or may never have existed,” he told TechNewsWorld. “A bad actor seeing that could create a code package under that name containing malicious code, which is then consistently recommended to developers by the generative AI tool.”
“Vulcan’s researchers took it a step further by pulling the most frequently asked questions from Stack Overflow, putting them to the AI, and seeing which recommended packages don’t exist,” he said.
According to the researchers, they queried Stack Overflow to get the most common questions asked about more than 40 topics, and used the first 100 questions for each topic.
Then, they asked ChatGPT, via its API, all the questions they had collected. They used the API to replicate an attacker’s approach to obtain as many non-existent package recommendations as possible in the shortest amount of time.
In each answer, they looked for a pattern in the package-installation command and extracted the recommended package. They then checked whether the recommended package existed. If it did not, they published one themselves.
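The existence check at the heart of that workflow is straightforward to reproduce. The sketch below is a minimal illustration of the idea, not the researchers’ actual tooling: it assumes Python-oriented answers, pulls package names out of “pip install” commands with a regular expression, and asks the PyPI JSON API whether each name is registered. The registry endpoint is real; the regex and sample answer text are assumptions for illustration.

```python
import re
import requests

# Minimal sketch (not the researchers' tooling): find packages that a
# model-generated answer tells the reader to install, then check whether
# each one actually exists on PyPI. A 404 from the JSON API means the
# name is unregistered -- i.e., a hallucinated package an attacker could claim.

INSTALL_RE = re.compile(r"pip\s+install\s+([A-Za-z0-9_.\-]+)")

def recommended_packages(answer_text: str) -> set[str]:
    """Extract package names from 'pip install <name>' commands in an answer."""
    return set(INSTALL_RE.findall(answer_text))

def exists_on_pypi(name: str) -> bool:
    """Return True if the package name is registered on PyPI."""
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    return resp.status_code == 200

if __name__ == "__main__":
    answer = "To parse the file, run: pip install totally-made-up-parser"
    for pkg in recommended_packages(answer):
        status = "exists" if exists_on_pypi(pkg) else "NOT on PyPI (possibly hallucinated)"
        print(f"{pkg}: {status}")
```

Run at scale over many model answers, the same check reveals which hallucinated names recur often enough to be worth an attacker’s time to register.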
Gluing Software Together
Malicious packages generated with code from ChatGPT have already been spotted on the package registries PyPI and npm, said Henrik Plate, a security researcher at Endor Labs, a dependency management company in Palo Alto, Calif.
“Large language models can also aid attackers in building malware variants that implement the same logic but have different forms and structures, for example by distributing the malicious code across different functions, changing identifiers, or creating fake comments and dead code and similar techniques,” he told TechNewsWorld.
The problem with software today is that it is not written independently, observed Ira Winkler, chief information security officer at CYE, a global provider of automated software security technologies.
“It’s basically a lot of software already cobbled together,” he told TechNewsWorld. “It’s very efficient, so a developer doesn’t have to write a simple function from scratch.”
However, this can result in developers importing code without properly vetting it.
“ChatGPT users are receiving instructions to install open-source software packages that, while appearing legitimate, could install a malicious package,” said Joseph Harush, head of software supply chain security at Checkmarx, an application security company in Tel Aviv, Israel.
“In general,” he told TechNewsWorld, “a copy-paste-exec culture is dangerous. Doing this blindly with output from sources like ChatGPT can lead to supply chain attacks, as the Vulcan research team has demonstrated.”
Know Your Code Sources
Melissa Bischoping, director of endpoint security research at Tanium, a converged endpoint management provider in Kirkland, Wash., also warned about lax use of third-party code.
“You should never download and execute code that you don’t understand and haven’t tested by just grabbing it from a random source, such as an open-source GitHub repo or, now, ChatGPT recommendations,” she told TechNewsWorld.
“Any code you intend to run should be evaluated for security, and you should have private copies of it,” she advised. “Do not import directly from public repositories, such as those used in the attack Vulcan demonstrated.”
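One way to act on that advice is to gate dependencies through a vetted list before anything is installed. The snippet below is a minimal sketch of that idea under assumed file names (a standard requirements.txt and an internal approved-packages.txt allowlist); it simply flags any requirement that has not been reviewed and does not reflect any specific vendor’s tooling.

```python
import re
from pathlib import Path

# Minimal sketch: flag requirements that are not on an internally vetted
# allowlist before they are installed. The file names are illustrative
# assumptions, not part of any standard tooling.

def load_names(path: str) -> set[str]:
    """Read one requirement per line; keep only the package-name portion."""
    names = set()
    for line in Path(path).read_text().splitlines():
        line = line.split("#")[0].strip()  # drop comments and blank lines
        if line:
            # Strip version specifiers and extras, e.g. "requests==2.31.0" -> "requests".
            names.add(re.split(r"[=<>!~\[;\s]", line, maxsplit=1)[0].lower())
    return names

if __name__ == "__main__":
    approved = load_names("approved-packages.txt")   # packages your team has reviewed
    requested = load_names("requirements.txt")       # packages a build wants to install

    unvetted = sorted(requested - approved)
    if unvetted:
        print("Refusing to install unreviewed packages:", ", ".join(unvetted))
    else:
        print("All requested packages are on the approved list.")
```

In practice, teams usually enforce the same idea at the registry level with a private mirror or proxy rather than a one-off script, but the principle is identical: nothing gets installed that has not been reviewed and copied into a source you control.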
She said that attacking supply chains through shared or imported third-party libraries is not new.
“This strategy will continue to be used,” she warned, “and the best defense is to employ secure coding practices and to thoroughly test and review code intended for use in production environments, especially code developed by third parties.”
“Don’t blindly trust every library or package you find on the internet or in a chat with an AI,” she cautioned.
Know the source of your code, said Dan Lorenc, CEO and co-founder of Chainguard, a maker of software supply chain security solutions in Seattle.
“Developer authenticity, verified through signed commits and packages, and getting open-source artifacts from a source or vendor you can trust, are the only real long-term prevention mechanisms against these Sybil-style attacks,” he told TechNewsWorld.
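Checking commit signatures is one concrete form of that verification. The sketch below is an illustrative example only, not Chainguard’s or anyone else’s product: it uses Python’s subprocess module to call git’s built-in verify-commit on a locally checked-out dependency, and it assumes the maintainers sign their commits with GPG and that their public keys are already trusted in your keyring. The dependency path is hypothetical.

```python
import subprocess

# Illustrative sketch: verify the GPG signature on the latest commit of a
# vendored dependency using git's built-in verify-commit. Assumes the
# maintainer's signing key is already trusted in the local keyring.

def latest_commit_is_signed(repo_path: str) -> bool:
    result = subprocess.run(
        ["git", "-C", repo_path, "verify-commit", "HEAD"],
        capture_output=True,
        text=True,
    )
    # git verify-commit exits non-zero if HEAD is unsigned or the
    # signature cannot be validated.
    return result.returncode == 0

if __name__ == "__main__":
    path = "./third_party/some-dependency"  # hypothetical local checkout
    print("signature OK" if latest_commit_is_signed(path) else "unsigned or unverifiable")
```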
Early Innings
Authenticating code, however, isn’t always simple, said Bud Broomhead, CEO of Viakoo, a developer of cyber and physical security software solutions in Mountain View, Calif.
“In many types of digital assets – and especially in IoT/OT devices – firmware still lacks digital signatures or other forms of establishing trust, which makes exploitation possible,” he told TechNewsWorld.
“We are in the early innings of generative AI being used for both cybercrime and cyber defense. Credit to Vulcan and the other organizations that are tuning large language models toward spotting new threats in a timely manner and preventing this type of exploitation,” he said.
“Remember,” he continued, “it was only a few months ago that you could tell ChatGPT to create a new piece of malware, and it would. Now it takes very specific and directed guidance to get it to create one inadvertently, and hopefully even that avenue will soon be closed off by the AI engines.”