New York Examiner News

    Elon Musk’s New Grok 4 Takes on ‘Humanity’s Last Exam’ as the AI Race Heats Up

    By Admin, July 12, 2025

    Elon Musk has launched xAI’s Grok 4—calling it the “world’s smartest AI” and claiming it can ace Ph.D.-level exams and outpace rivals such as Google’s Gemini and OpenAI’s o3 on tough benchmarks

    By Deni Ellis Béchard, edited by Dean Visser

    [Image: Digital illustration of a structure made of cubes that evolves from a simple shape on the left into a more complex figure of a contemplating person seated on a rock.]

    Elon Musk released the newest artificial intelligence model from his company xAI on Wednesday night. In an hour-long public reveal session, he called the model, Grok 4, “the smartest AI in the world” and claimed it was capable of getting perfect SAT scores and near-perfect GRE results in every subject, from the humanities to the sciences.

    During the online launch, Musk and members of his team described testing Grok 4 on a metric called Humanity’s Last Exam (HLE), a 2,500-question benchmark designed to evaluate an AI’s academic knowledge and reasoning skill. Created by nearly 1,000 human experts across more than 100 disciplines and released in January 2025, the test spans topics from the classics to quantum chemistry and mixes text with images. Grok 4 reportedly scored 25.4 percent on its own. But given access to tools (such as external aids for code execution or Web searches), it hit 38.6 percent. That jumped to 44.4 percent with a version called Grok 4 Heavy, which uses multiple AI agents to solve problems. The next two best-performing AI models are Google’s Gemini-Pro (which achieved 26.9 percent with tools) and OpenAI’s o3 model (which got 24.9 percent, also with tools).

    The results from xAI’s internal testing have yet to appear on the leaderboard for HLE, however, and it remains unclear whether this is because xAI has yet to submit the results or because those results are pending review. Manifold, a social prediction market platform where users bet play money (called “Mana”) on future events in politics, technology and other subjects, predicted a 1 percent chance, as of Friday morning, that Grok 4 would debut on HLE’s leaderboard with a score of 45 percent or greater within a month of its release. (xAI itself has claimed a score of only 44.4 percent.)

    During the launch, the xAI team also ran live demonstrations showing Grok 4 crunching baseball odds, determining which xAI employee has the “weirdest” profile picture on X and generating a simulated visualization of a black hole. Musk suggested that the system may discover entirely new technologies later this year and possibly “new physics” by the end of next year. Games and movies are on the horizon, too, with Musk predicting that Grok 4 will be able to make playable titles and watchable films by 2026. Grok 4 also has new audio capabilities, including a voice that sang during the launch, and Musk said new image generation and coding tools are soon to be released. The regular version of Grok 4 costs $30 a month; SuperGrok Heavy, the deluxe package with multiple agents and research tools, runs $300 a month.


    Artificial Analysis, an independent benchmarking platform that ranks AI models, now lists Grok 4 as highest on its Artificial Analysis Intelligence Index, slightly ahead of Gemini 2.5 Pro and OpenAI’s o4-mini-high. And Grok 4 appears as the top-performing publicly available model on the leaderboards for the Abstraction and Reasoning Corpus, or ARC-AGI-1, and its second edition, ARC-AGI-2—benchmarks that measure progress toward “humanlike” general intelligence. Greg Kamradt, president of ARC Prize Foundation, a nonprofit organization that maintains the two leaderboards, says that when the xAI team contacted the foundation with Grok 4’s results, the organization then independently tested Grok 4 on a dataset to which the xAI team did not have access and confirmed the results. “Before we report performance for any lab, it’s not verified unless we verify it,” Kamradt says. “We approved the [testing results] slide that [the xAI team] showed in the launch.”

    According to xAI, Grok 4 also outstrips other AI systems on a number of additional benchmarks that suggest its strength in STEM subjects. Alex Olteanu, a senior data science editor at AI education platform DataCamp, has tested it. “Grok has been strong on math and programming in my tests, and I’ve been impressed by the quality of its chain-of-thought reasoning, which shows an ingenious and logically sound approach to problem-solving,” Olteanu says. “Its context window, however, isn’t very competitive, and it may struggle with large code bases like those you encounter in production. It also fell short when I asked it to analyze a 170-page PDF, likely due to its limited context window and weak multimodal abilities.” (Multimodal abilities refer to a model’s capacity to analyze more than one kind of data at the same time, such as a combination of text, images, audio and video.)

    Other issues with Grok 4 have surfaced since its release. Several posters on X (which Musk owns), as well as tech-industry news outlets, have reported that when Grok 4 was asked questions about the Israeli-Palestinian conflict, abortion and U.S. immigration law, it often searched for Musk’s stance on these issues by referencing his X posts and articles written about him. And the release of Grok 4 comes after several controversies with Grok 3, the previous model, which issued outputs that included antisemitic comments, praise for Hitler and claims of “white genocide.” xAI publicly acknowledged these incidents, attributing them to unauthorized manipulations and stating that the company was implementing corrective measures.

    At one point during the launch, Musk commented on how making an AI smarter than humans is frightening, though he said he believes the ultimate result will be good—probably. “I somewhat reconciled myself to the fact that, even if it wasn’t going to be good, I’d at least like to be alive to see it happen,” he said.


