New York Examiner News

    Technology

    Making AI models ‘forget’ undesirable data hurts their performance

    July 29, 2024


    So-called “unlearning” techniques are used to make a generative AI model forget specific and undesirable info it picked up from training data, like sensitive private data or copyrighted material.

    But current unlearning techniques are a double-edged sword: They could make a model like OpenAI’s GPT-4o or Meta’s Llama 3.1 405B much less capable of answering basic questions.

    That’s according to a new study co-authored by researchers at the University of Washington (UW), Princeton, the University of Chicago, USC and Google, which found that the most popular unlearning techniques today tend to degrade models — often to the point where they’re unusable.

    “Our evaluation suggests that currently feasible unlearning methods are not yet ready for meaningful usage or deployment in real-world scenarios,” Weijia Shi, a researcher on the study and a Ph.D. candidate in computer science at UW, told TechCrunch. “Currently, there are no efficient methods that enable a model to forget specific data without considerable loss of utility.”

    How models learn

    Generative AI models have no real intelligence. They’re statistical systems that predict words, images, speech, music, videos and other data. Fed an enormous number of examples (e.g. movies, voice recordings, essays and so on), AI models learn how likely data is to occur based on patterns, including the context of any surrounding data.

    Given an email ending in the fragment “Looking forward…”, for example, a model trained to autocomplete messages might suggest “… to hearing back,” following the pattern of all the emails it’s ingested. There’s no intentionality there; the model isn’t looking forward to anything. It’s simply making an informed guess.
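The autocomplete idea can be illustrated with a deliberately tiny word-pair counter. This is only a sketch under stated assumptions: the four-line "inbox" is made up, and the helper names `train_bigram` and `suggest` are hypothetical. Production models use neural networks trained on vastly more data, but the principle of guessing the likeliest continuation from observed patterns is the same.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def suggest(counts, prev_word):
    """Return the most frequent follower of prev_word and its estimated probability."""
    followers = counts.get(prev_word.lower())
    if not followers:
        return None, 0.0
    word, n = followers.most_common(1)[0]
    return word, n / sum(followers.values())


# Hypothetical toy "inbox" used only for illustration.
emails = [
    "looking forward to hearing back",
    "looking forward to your reply",
    "looking forward to hearing from you",
    "moving forward with the plan",
]

model = train_bigram(emails)
word, prob = suggest(model, "forward")
print(word, prob)  # to 0.75
```

The counter has no notion of anticipation. It suggests "to" after "forward" only because three of the four toy emails continue that way.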

    Most models, including flagships like GPT-4o, are trained on data sourced from public websites and data sets around the web. Most vendors developing such models argue that fair use shields their practice of scraping data and using it for training without informing, compensating or even crediting the data’s owners.

    But not every copyright holder agrees. And many — from authors to publishers to record labels — have filed lawsuits against vendors to force a change.

    The copyright dilemma is one of the reasons unlearning techniques have gained a lot of attention lately. Google, in partnership with several academic institutions, last year launched a competition seeking to spur the creation of new unlearning approaches.

    Unlearning could also provide a way to remove sensitive info from existing models, like medical records or compromising photos, in response to a request or government order. (Thanks to the way they’re trained, models tend to sweep up lots of private information, from phone numbers to more problematic examples.) Over the past few years, some vendors have rolled out tools to allow data owners to ask that their data be removed from training sets. But these opt-out tools only apply to future models, not models trained before they rolled out; unlearning would be a much more thorough approach to data deletion.

    Regardless, unlearning isn’t as easy as hitting “Delete.”

    The art of forgetting

    Unlearning techniques today rely on algorithms designed to “steer” models away from the data to be unlearned. The idea is to influence the model’s predictions so that it never — or only very rarely — outputs certain data.
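To make "steering" concrete, here is a toy sketch of one assumed technique: gradient ascent on the data to be forgotten. The article does not say which eight algorithms were tested, so this should be read as an illustration of the general idea rather than a description of the study's methods. The three-word vocabulary, the learning rate, and the step counts are all hypothetical.

```python
import math

# Hypothetical three-word vocabulary for the next token after "said Aunt".
VOCAB = ["petunia", "marge", "polly"]


def softmax(logits):
    """Convert raw scores into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def step(logits, target_idx, lr, ascend=False):
    """One gradient step on the cross-entropy loss for a single target token.

    The gradient with respect to the logits is (probs - one_hot(target)).
    Descent lowers the loss (learning); ascent raises it (assumed unlearning).
    """
    probs = softmax(logits)
    sign = 1.0 if ascend else -1.0
    return [
        x + sign * lr * (p - (1.0 if i == target_idx else 0.0))
        for i, (x, p) in enumerate(zip(logits, probs))
    ]


target = VOCAB.index("petunia")
logits = [0.0, 0.0, 0.0]

# "Training": make the model confident that "petunia" comes next.
for _ in range(50):
    logits = step(logits, target, lr=0.5)
p_before = softmax(logits)[target]

# "Unlearning": push the model away from that same continuation.
for _ in range(100):
    logits = step(logits, target, lr=0.5, ascend=True)
p_after = softmax(logits)[target]

print(f"before: {p_before:.3f}  after: {p_after:.3f}")
```

In this toy the logits belong to a single context, so nothing else is disturbed. In a real model the same weights serve many contexts, which is one reason steering away from one text can affect unrelated answers.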

    To see how effective these unlearning algorithms could be, Shi and her collaborators devised a benchmark and selected eight different open algorithms to test. Called MUSE (Machine Unlearning Six-way Evaluation), the benchmark aims to probe an algorithm’s ability not only to prevent a model from spitting out training data verbatim (a phenomenon known as regurgitation), but also to eliminate the model’s knowledge of that data along with any evidence that it was originally trained on the data.

    Scoring well on MUSE requires making a model forget two things: books from the Harry Potter series and news articles.

    For example, given a snippet from Harry Potter and The Chamber of Secrets (“‘There’s more in the frying pan,’ said Aunt…”), MUSE tests whether an unlearned model can recite the whole sentence (“‘There’s more in the frying pan,’ said Aunt Petunia, turning eyes on her massive son”), answer questions about the scene (e.g. “What does Aunt Petunia tell her son?”, “More in the frying pan”) or otherwise indicate it’s been trained on text from the book.
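A regurgitation check of this kind can be approximated with a simple word-by-word comparison. The sketch below is a crude proxy under stated assumptions: the article does not describe MUSE's actual scoring, the function name `shared_prefix_ratio` is hypothetical, and the "unlearned" continuation is invented for illustration.

```python
def shared_prefix_ratio(generated, reference):
    """Fraction of the reference's words reproduced, in order, from the start."""
    gen, ref = generated.split(), reference.split()
    matched = 0
    for g, r in zip(gen, ref):
        if g != r:
            break
        matched += 1
    return matched / len(ref) if ref else 0.0


# The true continuation of the prompt, as quoted in the article.
reference = "Petunia, turning eyes on her massive son"

# Hypothetical model outputs for the same prompt.
memorized = "Petunia, turning eyes on her massive son"
unlearned = "Marge, looking at the garden"

print(shared_prefix_ratio(memorized, reference))  # 1.0
print(shared_prefix_ratio(unlearned, reference))  # 0.0
```

A score near 1.0 suggests the model is reciting the passage. A score near 0.0 shows only that it did not recite it this time, which is why the benchmark also asks questions about the content.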

    MUSE also tests whether the model retained related general knowledge — e.g. that J.K. Rowling is the author of the Harry Potter series — after unlearning, which the researchers refer to as the model’s overall utility. The lower the utility, the more related knowledge the model lost, making the model less able to correctly answer questions.

    In their study, the researchers found that the unlearning algorithms they tested did make models forget certain information. But they also hurt the models’ general question-answering capabilities, presenting a trade-off.
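The trade-off can be pictured as two scores tracked side by side: how well the model still answers questions about the forgotten text, and how well it answers related questions it should keep. The sketch below uses made-up answers to show the bookkeeping only. The question sets, the lookup-table "models", and the helper names are hypothetical, and the numbers it prints are not results from the study.

```python
def accuracy(answer_fn, qa_pairs):
    """Share of questions whose answer contains the expected string (case-insensitive)."""
    if not qa_pairs:
        return 0.0
    hits = sum(expected.lower() in answer_fn(q).lower() for q, expected in qa_pairs)
    return hits / len(qa_pairs)


# Hypothetical question sets: one to forget, one to retain.
forget_set = [("What does Aunt Petunia tell her son?", "more in the frying pan")]
retain_set = [("Who wrote the Harry Potter series?", "J.K. Rowling")]

# Hypothetical stand-ins for a model before and after unlearning.
original = {
    forget_set[0][0]: "There's more in the frying pan.",
    retain_set[0][0]: "J.K. Rowling",
}
unlearned = {
    forget_set[0][0]: "I don't know.",
    retain_set[0][0]: "I don't know.",  # collateral loss of related knowledge
}


def report(name, table):
    """Score one lookup-table model on both question sets."""
    answer = lambda q: table.get(q, "")
    return {
        "model": name,
        "forget_accuracy": accuracy(answer, forget_set),
        "utility": accuracy(answer, retain_set),
    }


before = report("original", original)
after = report("unlearned", unlearned)
print(before)
print(after)
```

An ideal method would drive `forget_accuracy` to zero while leaving `utility` untouched. In this invented example both fall together, mirroring the pattern the researchers describe.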

    “Designing effective unlearning methods for models is challenging because knowledge is intricately entangled in the model,” Shi explained. “For instance, a model may be trained on copyrighted material — Harry Potter books as well as on freely available content from the Harry Potter Wiki. When existing unlearning methods attempt to remove the copyrighted Harry Potter books, they significantly impact the model’s knowledge about the Harry Potter Wiki, too.”

    Are there any solutions to the problem? Not yet — and this highlights the need for additional research, Shi said.

    For now, vendors betting on unlearning as a solution to their training data woes appear to be out of luck. Perhaps a technical breakthrough will make unlearning feasible someday. But for the time being, vendors will have to find another way to prevent their models from saying things they shouldn’t.


