Science News | Slashdot

Fedora 41 Finally Retires Python 2.7 (fedoraproject.org) 25

Posted by EditorDavid on Sunday July 07, 2024 @10:34AM from the lived-long-and-prospered dept.

'How Good Is ChatGPT at Coding, Really?' (ieee.org) 135

Posted by EditorDavid on Saturday July 06, 2024 @06:44PM from the opening-AI dept.

IEEE Spectrum (the IEEE's official publication) asks the question. "How does an AI code generator compare to a human programmer?" A study published in the June issue of IEEE Transactions on Software Engineering evaluated the code produced by OpenAI's ChatGPT in terms of functionality, complexity and security. The results show that ChatGPT has an extremely broad range of success when it comes to producing functional code — with a success rate ranging from anywhere as poor as 0.66 percent and as good as 89 percent — depending on the difficulty of the task, the programming language, and a number of other factors. While in some cases the AI generator could produce better code than humans, the analysis also reveals some security concerns with AI-generated code.
The study tested GPT-3.5 on 728 coding problems from the LeetCode testing platform — and in five programming languages: C, C++, Java, JavaScript, and Python. The results? Overall, ChatGPT was fairly good at solving problems in the different coding languages — but especially when attempting to solve coding problems that existed on LeetCode before 2021. For instance, it was able to produce functional code for easy, medium, and hard problems with success rates of about 89, 71, and 40 percent, respectively. "However, when it comes to the algorithm problems after 2021, ChatGPT's ability to generate functionally correct code is affected. It sometimes fails to understand the meaning of questions, even for easy level problems," said Yutian Tang, a lecturer at the University of Glasgow. For example, ChatGPT's ability to produce functional code for "easy" coding problems dropped from 89 percent to 52 percent after 2021. And its ability to generate functional code for "hard" problems dropped from 40 percent to 0.66 percent after this time as well...

The researchers also explored the ability of ChatGPT to fix its own coding errors after receiving feedback from LeetCode. They randomly selected 50 coding scenarios where ChatGPT initially generated incorrect coding, either because it didn't understand the content or problem at hand. While ChatGPT was good at fixing compiling errors, it generally was not good at correcting its own mistakes... The researchers also found that ChatGPT-generated code did have a fair amount of vulnerabilities, such as a missing null test, but many of these were easily fixable.
"Interestingly, ChatGPT is able to generate code with smaller runtime and memory overheads than at least 50 percent of human solutions to the same LeetCode problems..."

AI Researcher Warns Data Science Could Face a Reproducibility Crisis (beabytes.com) 56

Posted by EditorDavid on Sunday June 16, 2024 @08:16PM from the back-off-I'm-a-scientist dept.

Long-time Slashdot reader theodp shared this warning from a long-time AI researcher arguing that data science "is due" for a reckoning over whether results can be reproduced. "Few technological revolutions came with such a low barrier of entry as Machine Learning..." Unlike Machine Learning, Data Science is not an academic discipline, with its own set of algorithms and methods... There is an immense diversity, but also disparities in skill, expertise, and knowledge among Data Scientists... In practice, depending on their backgrounds, data scientists may have large knowledge gaps in computer science, software engineering, theory of computation, and even statistics in the context of machine learning, despite those topics being fundamental to any ML project. But it's ok, because you can just call the API, and Python is easy to learn. Right...?

Building products using Machine Learning and data is still difficult. The tooling infrastructure is still very immature and the non-standard combination of data and software creates unforeseen challenges for engineering teams. But in my views, a lot of the failures come from this explosive cocktail of ritualistic Machine Learning:

- Weak software engineering knowledge and practices compounded by the tools themselves;
- Knowledge gap in mathematical, statistical, and computational methods, encouraged black boxing API;
- Ill-defined range of competence for the role of data scientist, reinforced by a pool of candidates with an unusually wide range of backgrounds;
- A tendency to follow the hype rather than the science.

- What can you do?

- Hold your data scientists accountable using Science.
- At a minimum, any AI/ML project should include an Exploratory Data Analysis, whose results directly support the design choices for feature engineering and model selection.
- Data scientists should be encouraged to think outside-of-the box of ML, which is a very small box - Data scientists should be trained to use eXplainable AI methods to provide context about the algorithm's performance beyond the traditional performance metrics like accuracy, FPR, or FNR.
- Data scientists should be held at similar standards than other software engineering specialties, with code review, code documentation, and architectural designs.
The article concludes, "Until such practices are established as the norm, I'll remain skeptical of Data Science."

Is C++ More Popular Than C? 142

Posted by EditorDavid on Saturday June 15, 2024 @10:24PM from the C-you-later dept.

Python 'Language Summit' 2024: Security Workflows, Calendar Versioning, Transforms and Lightning Talks (blogspot.com) 19

Posted by EditorDavid on Saturday June 15, 2024 @10:04AM from the enhancement-proposals dept.

Friday the Python Software Foundation published several blog posts about this year's "Python Language Summit" May 15th (before PyCon US), which featured talks and discussions by core developers, triagers, and Python implementation maintainers.

There were several lightning talks. One talk came from the maintainer of the PyO3 project, offering Rust bindings for the Python C API (which requires mapping Rust concepts to Python — leaving a question as to how to map Rust's error-handling panic! macro). There was a talk on formalizing the PEP prototype process, and a talk on whether the Python team should have a more official presence in the Apple App Store (and maybe the Google Play Store). One talk suggested changing the formatting of error messages for assert statements, and one covered a "highly experimental" project to support structured data sharing between Python subinterpreters. One talk covered Python's "unsupported build" warning and how it should behave on platforms beyond Python's officially supported list.

Python Foundation blog posts also covered some of the longer talks, including one on the idea of using type annotations as a mechanism for transformers. One talk covered the new interactive REPL interpreter coming to Python 3.13.

And one talk focused on Python's security model after the xz-utils backdoor: Pablo Galindo Salgado, Steering Council member and the release manager for Python 3.10 and 3.11, brought this topic to the Language Summit to discuss what could be done to improve Python's security model... Pablo noted the similarities shared between CPython and xz-utils, referencing the previous Language Summit's talk on core developer burnout, the number of modules in the standard library that have one or zero maintainers, the high ratio of maintainers to source code, and the use of autotools for configuration. Autotools was used by [xz's] Jia Tan as part of the backdoor, specifically to obscure the changes to tainted release artifacts. Pablo confirmed along with many nods of agreement that indeed, CPython could be vulnerable to a contributor or core developer getting secretly malicious changes merged into the project.

For multiple reasons like being able to fix bugs and single-maintainer modules, CPython doesn't require reviewers on the pull requests of core developers. This can lead to "unilateral action", meaning that a change is introduced into CPython without the review of someone besides the author. Other situations like release managers backporting fixes to other branches without review are common.
Much discussion ensued about the possibility of altering workflows (including pull request reviews), identity verification, and the importance of post-incident action plans. Guido van Rossum suggested a "higher bar" for granting write access, but in the end "Overall it was clear there is more discussion and work to be done in this rapidly changing area."

In another talk, Hugo van Kemenade, the newly announced Release Manager for Python 3.14 and 3.15, "started the Language Summit with a proposal to change Python's versioning scheme. The perception of Python using semantic versioning is a source of confusion for users who don't expect backwards incompatible changes when upgrading to new versions of Python. In reality almost all new feature releases of Python include backwards incompatible changes such as the removal of "dead batteries" where PEP 594 marked 19 modules for removal in Python 3.13. Calendar Versioning (CalVer) encompasses a wide array of different versioning schemes that have one property in common: using the release date as part of a release's version... Hugo offered multiple proposed versioning schemes, including:

- Using the release year as minor version (3.YY.micro, "3.26.0")
- Using the release year as major version (YY.0.micro, "26.0.0")
- Using the release year and month as major and minor version (YY.MM.micro, "26.10.0")

[...] Overall the proposal to use the current year as the minor version was well-received, Hugo mentioned that he'd be drafting up a PEP for this change.

Rust Growing Fastest, But JavaScript Reigns Supreme (thenewstack.io) 55

Posted by EditorDavid on Saturday June 08, 2024 @03:34PM from the race-conditions dept.

"Rust is the fastest-growing programming language, with its developer community doubling in size over the past two years," writes The New Stack, "yet JavaScript remains the most popular language with 25.2 million active developers, according to the results of a recent survey." The 26th edition of SlashData's Developer Nation survey showed that the Rust community doubled its number of users over the past two years — from two million in the first quarter of 2022 to four million in the first quarter of 2024 — and by 33% in the last 12 months alone. The SlashData report covers the first quarter of 2024. "Rust has developed a passionate community that advocates for it as a memory-safe language which can provide great performance, but cybersecurity concerns may lead to an even greater increase," the report said. "The USA and its international partners have made the case in the last six months for adopting memory-safe languages...."

"JavaScript's dominant position is unlikely to change anytime soon, with its developer population increasing by 4M developers over the last 12 months, with a growth rate in line with the global developer population growth," the report said. The strength of the JavaScript community is fueled by the widespread use of the language across all types of development projects, with at least 25% of developers in every project type using it, the report said. "Even in development areas not commonly associated with the language, such as on-device coding for IoT projects, JavaScript still sees considerable adoption," SlashData said.

Also, coming in strong, Python has overtaken Java as the second most popular language, driven by the interest in machine learning and AI. The battle between Python and Java shows Python with 18.2 million developers in Q1 2024 compared to Java's 17.7 million. This comes about after Python added more than 2.1 million net new developers to its community over the last 12 months, compared to Java which only increased by 1.2 million developers... Following behind Java there is a six-million-developer gap to the next largest community, which is C++ with 11.4 million developers, closely trailed by C# with 10.2 million and PHP with 9.8 million. Languages with the smallest communities include Objective-C with 2.7 million developers, Ruby with 2.5 million, and Lua with 1.8 million. Meanwhile, the Go language saw its developer population grow by 10% over the last year. It had previously outpaced the global developer population growth, growing by 5Y% over the past two years, from three million in Q1 2022 to 4.7 million in Q1 2024.
"TNS analyst Lawrence Hecht has a few different takeaways. He notes that with the exceptions of Rust, Go and JavaScript, the other major programming languages all grew slower than the total developer population, which SlashData says increased 39% over the last two years alone."

Cybercriminal Posed as 'Helpful' Stack Overflow User To Recommend Malware Hosted on PyPi (bleepingcomputer.com) 43

Posted by EditorDavid on Monday June 03, 2024 @03:34AM from the worst-answer dept.

Mistral Releases Codestral, Its First Generative AI Model For Code (techcrunch.com) 27

Posted by msmash on Wednesday May 29, 2024 @02:00PM from the aggressive-expansion dept.

Mojo, Bend, and the Rise of AI-First Programming Languages (venturebeat.com) 26

Posted by EditorDavid on Sunday May 26, 2024 @04:53PM from the Mojo-rising dept.

"While general-purpose languages like Python, C++, and Java remain popular in AI development," writes VentureBeat, "the resurgence of AI-first languages signifies a recognition that AI's unique demands require specialized languages tailored to the domain's specific needs... designed from the ground up to address the specific needs of AI development." Bend, created by Higher Order Company, aims to provide a flexible and intuitive programming model for AI, with features like automatic differentiation and seamless integration with popular AI frameworks. Mojo, developed by Modular AI, focuses on high performance, scalability, and ease of use for building and deploying AI applications. Swift for TensorFlow, an extension of the Swift programming language, combines the high-level syntax and ease of use of Swift with the power of TensorFlow's machine learning capabilities...

At the heart of Mojo's design is its focus on seamless integration with AI hardware, such as GPUs running CUDA and other accelerators. Mojo enables developers to harness the full potential of specialized AI hardware without getting bogged down in low-level details. One of Mojo's key advantages is its interoperability with the existing Python ecosystem. Unlike languages like Rust, Zig or Nim, which can have steep learning curves, Mojo allows developers to write code that seamlessly integrates with Python libraries and frameworks. Developers can continue to use their favorite Python tools and packages while benefiting from Mojo's performance enhancements... It supports static typing, which can help catch errors early in development and enable more efficient compilation... Mojo also incorporates an ownership system and borrow checker similar to Rust, ensuring memory safety and preventing common programming errors. Additionally, Mojo offers memory management with pointers, giving developers fine-grained control over memory allocation and deallocation...

Mojo is conceptually lower-level than some other emerging AI languages like Bend, which compiles modern high-level language features to native multithreading on Apple Silicon or NVIDIA GPUs. Mojo offers fine-grained control over parallelism, making it particularly well-suited for hand-coding modern neural network accelerations. By providing developers with direct control over the mapping of computations onto the hardware, Mojo enables the creation of highly optimized AI implementations.

According to Mojo's creator, Modular, the language has already garnered an impressive user base of over 175,000 developers and 50,000 organizations since it was made generally available last August. Despite its impressive performance and potential, Mojo's adoption might have stalled initially due to its proprietary status. However, Modular recently decided to open-source Mojo's core components under a customized version of the Apache 2 license. This move will likely accelerate Mojo's adoption and foster a more vibrant ecosystem of collaboration and innovation, similar to how open source has been a key factor in the success of languages like Python.

Developers can now explore Mojo's inner workings, contribute to its development, and learn from its implementation. This collaborative approach will likely lead to faster bug fixes, performance improvements and the addition of new features, ultimately making Mojo more versatile and powerful.
The article also notes other languages "trying to become the go-to choice for AI development" by providing high-performance execution on parallel hardware. Unlike low-level beasts like CUDA and Metal, Bend feels more like Python and Haskell, offering fast object allocations, higher-order functions with full closure support, unrestricted recursion and even continuations. It runs on massively parallel hardware like GPUs, delivering near-linear speedup based on core count with zero explicit parallel annotations — no thread spawning, no locks, mutexes or atomics. Powered by the HVM2 runtime, Bend exploits parallelism wherever it can, making it the Swiss Army knife for AI — a tool for every occasion...

The resurgence of AI-focused programming languages like Mojo, Bend, Swift for TensorFlow, JAX and others marks the beginning of a new era in AI development. As the demand for more efficient, expressive, and hardware-optimized tools grows, we expect to see a proliferation of languages and frameworks that cater specifically to the unique needs of AI. These languages will leverage modern programming paradigms, strong type systems, and deep integration with specialized hardware to enable developers to build more sophisticated AI applications with unprecedented performance. The rise of AI-focused languages will likely spur a new wave of innovation in the interplay between AI, language design and hardware development. As language designers work closely with AI researchers and hardware vendors to optimize performance and expressiveness, we will likely see the emergence of novel architectures and accelerators designed with these languages and AI workloads in mind. This close relationship between AI, language, and hardware will be crucial in unlocking the full potential of artificial intelligence, enabling breakthroughs in fields like autonomous systems, natural language processing, computer vision, and more.

The future of AI development and computing itself are being reshaped by the languages and tools we create today.
In 2017 Modular AI's founder Chris Lattner (creator of the Swift and LLVM) answered questions from Slashdot readers.

FORTRAN and COBOL Re-enter TIOBE's Ranking of Programming Language Popularity (i-programmer.info) 93

Posted by EditorDavid on Sunday May 19, 2024 @06:01PM from the popularity-contests dept.

"The TIOBE Index sets out to reflect the relative popularity of computer languages," writes i-Programmer, "so it comes as something of a surprise to see two languages dating from the 1950's in this month's Top 20. Having broken into the the Top 20 in April 2021 Fortran has continued to rise and has now risen to it's highest ever position at #10... The headline for this month's report by Paul Jansen on the TIOBE index is:

Fortran in the top 10, what is going on?

Jansen's explanation points to the fact that there are more than 1,000 hits on Amazon for "Fortran Programming" while languages such as Kotlin and Rust, barely hit 300 books for the same search query. He also explains that Fortran is still evolving with the new ISO Fortran 2023 definition published less than half a year ago....

The other legacy language that is on the rise in the TIOBE index is COBOL. We noticed it re-enter the Top 20 in January 2024 and, having dropped out in the interim, it is there again this month.
More details from TechRepublic: Along with Fortran holding on to its spot in the rankings, there were a few small changes in the top 10. Go gained 0.61 percentage points year over year, rising from tenth place in May 2023 to eighth this year. C++ rose slightly in popularity year over year, from fourth place to third, while Java (-3.53%) and Visual Basic (-1.8) fell.
Here's how TIOBE ranked the 10 most popular programming languages in May:

Python
C
C++
Java
C#
JavaScript
Visual Basic
Go
SQL
Fortran

On the rival PYPL ranking of programming language popularity, Fortran does not appear anywhere in the top 29.

A note on its page explains that "Worldwide, Python is the most popular language, Rust grew the most in the last 5 years (2.1%) and Java lost the most (-4.0%)." Here's how it ranks the 10 most popular programming languages for May:

Python (28.98% share)
Java (15.97% share)
JavaScript (8.79%)
C# (6.78% share)
R (4.76% share)
PHP (4.55% share)
TypeScript (3.03% share)
Swift (2.76% share)
Rust (2.6% share)

Google Lays Off Hundreds of 'Core' Employees, Moves Some Positions To India and Mexico (cnbc.com) 80

Posted by BeauHD on Wednesday May 01, 2024 @09:25PM from the latest-layoffs dept.

Google Lays Off Staff From Flutter, Dart and Python Teams (techcrunch.com) 28

Posted by BeauHD on Monday April 29, 2024 @06:40PM from the budget-cuts dept.

Fake Job Interviews Target Developers With New Python Backdoor (bleepingcomputer.com) 16

Posted by BeauHD on Friday April 26, 2024 @08:02PM from the stay-vigilant dept.

GPT-4 Can Exploit Real Vulnerabilities By Reading Security Advisories 74

Posted by EditorDavid on Sunday April 21, 2024 @05:05PM from the AI-exploits dept.

Is PHP Declining In Popularity? (infoworld.com) 94

Posted by EditorDavid on Sunday April 14, 2024 @12:34PM from the protect-and-server dept.

Code.org Launches AI Teaching Assistant For Grades 6-10 In Stanford Partnership (illinois.edu) 16

Posted by BeauHD on Thursday April 11, 2024 @06:40PM from the in-classroom-support dept.

theodp writes: From a Wednesday press release: "Code.org, in collaboration with The Piech Lab at Stanford University, launched today its AI Teaching Assistant, ushering in a new era of computer science instruction to support teachers in preparing students with the foundational skills necessary to work, live and thrive in an AI world. [...] Launching as a part of Code.org's leading Computer Science Discoveries (CSD) curriculum [for grades 6-10], the tool is designed to bolster teacher confidence in teaching computer science." EdWeek reports that in a limited pilot project involving twenty teachers nationwide, the AI computer science grading tool cut one middle school teacher's grading time in half. Code.org is now inviting an additional 300 teachers to give the tool a try. "Many teachers who lead computer science courses," EdWeek notes, "don't have a degree in the subject -- or even much training on how to teach it -- and might be the only educator in their school leading a computer science course."

Stanford's Piech Lab is headed by assistant professor of CS Chris Piech, who also runs the wildly-successful free Code in Place MOOC (30,000+ learners and counting), which teaches fundamentals from Stanford's flagship introduction to Python course. Prior to coming up with the new AI teaching assistant, which automatically assesses Code.org students' JavaScript game code, Piech worked on a Stanford Research team that partnered with Code.org nearly a decade ago to create algorithms to generate hints for K-12 students trying to solve Code.org's Hour of Code block-based programming puzzles (2015 paper [PDF]). And several years ago, Piech's lab again teamed with Code.org on Play-to-Grade, which sought to "provide scalable automated grading on all types of coding assignments" by analyzing the game play of Code.org students' projects. Play-to-Grade, a 2022 paper (PDF) noted, was "supported in part by a Stanford Hoffman-Yee Human Centered AI grant" for AI tutors to help prepare students for the 21st century workforce. That project also aimed to develop a "Super Teaching Assistant" for Piech's Code in Place MOOC. LinkedIn co-founder Reid Hoffman, who was present for the presentation of the 'AI Tutors' work he and his wife funded, is a Code.org Diamond Supporter ($1+ million). In other AI grading news, Texas will use computers to grade written answers on this year's STAAR tests. The state will save more than $15 million by using technology similar to ChatGPT to give initial scores, reducing the number of human graders needed.

Rust, Python, Apache Foundations and Others Announce Big Collaboration on Cybersecurity Process Specifications (eclipse-foundation.blog) 42

Posted by EditorDavid on Sunday April 07, 2024 @05:26PM from the serious-on-security dept.

The foundations behind Rust, Python, Apache, Eclipse, PHP, OpenSSL, and Blender announced plans to create "common specifications for secure software development," based on "existing open source best practices."

From the Eclipse Foundation: This collaborative effort will be hosted at the Brussels-based Eclipse Foundation [an international non-profit association] under the auspices of the Eclipse Foundation Specification Process and a new working group... Other code-hosting open source foundations, SMEs, industry players, and researchers are invited to join in as well.

The starting point for this highly technical standardisation effort will be today's existing security policies and procedures of the respective open source foundations, and similar documents describing best practices.

The governance of the working group will follow the Eclipse Foundation's usual member-led model but will be augmented by explicit representation from the open source community to ensure diversity and balance in decision-making. The deliverables will consist of one or more process specifications made available under a liberal specification copyright licence and a royalty-free patent licence... While open source communities and foundations generally adhere to and have historically established industry best practices around security, their approaches often lack alignment and comprehensive documentation.

The open source community and the broader software industry now share a common challenge: legislation has introduced an urgent need for cybersecurity process standards.
The Apache Foundation notes the working group is forming partly "to demonstrate our commitment to cooperation with and implementation of" the EU's Cyber Resilience Act. But the Eclipse Foundation adds that even before it goes into effect in 2027, they're recognizing open source software's "increasingly vital role in modern society" and an increasing need for reliability, safety, and security, so new regulations like the CRA "underscore the urgency for secure by design and robust supply chain security standards."

Their announcement adds that "It is also important to note that it is similarly necessary that these standards be developed in a manner that also includes the requirements of proprietary software development, large enterprises, vertical industries, and small and medium enterprises." But at the same time, "Today's global software infrastructure is over 80% open source... [W]hen we discuss the 'software supply chain,' we are primarily, but not exclusively, referring to open source."

"We invite you to join our collaborative effort to create specifications for secure open source development," their announcement concludes," promising initiative updates on a new mailing list. "Contribute your ideas and participate in the magic that unfolds when open source foundations, SMEs, industry leaders, and researchers combine forces to tackle big challenges."

The Python Foundation's announcement calls it a "community-driven initiative" that will have "a lasting impact on the future of cybersecurity and our shared open source communities."

Roblox Executive Says Children Making Money On the Platform Isn't Exploitation, It's a Gift (eurogamer.net) 60

Posted by BeauHD on Thursday April 04, 2024 @08:02PM from the eyebrow-raising dept.

In an interview with Roblox Studio head Stefano Corazza, Eurogamer asked about the reputation Roblox has gained and the notion that it was exploitative of young developers, since it takes a cut from work sometimes produced by children. Here's what he had to say: "I don't know, you can say this for a lot of things, right?" Corazza said. "Like, you can say, 'Okay, we are exploiting, you know, child labour,' right? Or, you can say: we are offering people anywhere in the world the capability to get a job, and even like an income. So, I can be like 15 years old, in Indonesia, living in a slum, and then now, with just a laptop, I can create something, make money and then sustain my life. "There's always the flip side of that, when you go broad and democratized - and in this case, also with a younger audience," he continued. "I mean, our average game developer is in their 20s. But of course, there's people that are teenagers -- and we have hired some teenagers that had millions of players on the platform.

"For them, you know, hearing from their experience, they didn't feel like they were exploited! They felt like, 'Oh my god, this was the biggest gift, all of a sudden I could create something, I had millions of users, I made so much money I could retire.' So I focus more on the amount of money that we distribute every year to creators, which is now getting close to like a billion dollars, which is phenomenal."

At this point the PR present during the interview added that "the vast majority of people that are earning money on Roblox are over the age of 18." "And imagine like, the millions of kids that learn how to code every month," Corazza said. "We have millions of creators in Roblox Studio. They learn Lua scripting," a programming language, "which is pretty close to Python - you can get a job in the tech industry in the future, and be like, 'Hey, I'm a programmer,' right? "I think that we are really focusing on the learning - the curriculum, if you want - and really bringing people on and empowering them to be professionals."

AI Hallucinated a Dependency. So a Cybersecurity Researcher Built It as Proof-of-Concept Malware (theregister.com) 44

Posted by EditorDavid on Saturday March 30, 2024 @03:34PM from the this-package-doesn't-exist dept.

"Several big businesses have published source code that incorporates a software package previously hallucinated by generative AI," the Register reported Thursday

"Not only that but someone, having spotted this reoccurring hallucination, had turned that made-up dependency into a real one, which was subsequently downloaded and installed thousands of times by developers as a result of the AI's bad advice, we've learned." If the package was laced with actual malware, rather than being a benign test, the results could have been disastrous.

According to Bar Lanyado, security researcher at Lasso Security, one of the businesses fooled by AI into incorporating the package is Alibaba, which at the time of writing still includes a pip command to download the Python package huggingface-cli in its GraphTranslator installation instructions. There is a legit huggingface-cli, installed using pip install -U "huggingface_hub[cli]". But the huggingface-cli distributed via the Python Package Index (PyPI) and required by Alibaba's GraphTranslator — installed using pip install huggingface-cli — is fake, imagined by AI and turned real by Lanyado as an experiment.

He created huggingface-cli in December after seeing it repeatedly hallucinated by generative AI; by February this year, Alibaba was referring to it in GraphTranslator's README instructions rather than the real Hugging Face CLI tool... huggingface-cli received more than 15,000 authentic downloads in the three months it has been available... "In addition, we conducted a search on GitHub to determine whether this package was utilized within other companies' repositories," Lanyado said in the write-up for his experiment. "Our findings revealed that several large companies either use or recommend this package in their repositories...."

Lanyado also said that there was a Hugging Face-owned project that incorporated the fake huggingface-cli, but that was removed after he alerted the biz.
"With GPT-4, 24.2 percent of question responses produced hallucinated packages, of which 19.6 percent were repetitive, according to Lanyado..."

GitHub Introduces AI-Powered Tool That Suggests Ways It Can Auto-Fix Your Code (bleepingcomputer.com) 24

Posted by EditorDavid on Sunday March 24, 2024 @06:34PM from the found-means-fixed dept.

2014	U.S. Supreme Court Upholds Religious Objections To Contraception	1330 comments
2007	Google Protects Healthcare From Michael Moore	1153 comments
2005	Justice O'Connor Retiring	1157 comments
2004	Linux vs. Windows: What's The Difference?	1219 comments
2003	Bill Gates On Linux	1194 comments

Slashdot Top Deals