UK Daily: Tech, Science, Business & Lifestyle News Updates
    Running AI models is turning into a memory game

    By bibhuti | February 18, 2026 | Tech | 3 Mins Read


    When we talk about the cost of AI infrastructure, the focus is usually on Nvidia and GPUs — but memory is an increasingly important part of the picture. As hyperscalers prepare to build out billions of dollars’ worth of new data centers, the price for DRAM chips has jumped roughly 7x in the last year.

    At the same time, there’s a growing discipline in orchestrating all that memory to make sure the right data gets to the right agent at the right time. The companies that master it will be able to make the same queries with fewer tokens, which can be the difference between folding and staying in business.

    Semiconductor analyst Doug O’Laughlin has an interesting look at the importance of memory chips on his Substack, where he talks with Val Bercovici, chief AI officer at Weka. They’re both semiconductor guys, so the focus is more on the chips than the broader architecture, but the implications for AI software are pretty significant too.

    I was particularly struck by this passage, in which Bercovici looks at the growing complexity of Anthropic’s prompt-caching documentation:

    The tell is if we go to Anthropic’s prompt caching pricing page. It started off as a very simple page six or seven months ago, especially as Claude Code was launching — just “use caching, it’s cheaper.” Now it’s an encyclopedia of advice on exactly how many cache writes to pre-buy. You’ve got 5-minute tiers, which are very common across the industry, or 1-hour tiers — and nothing above. That’s a really important tell. Then of course you’ve got all sorts of arbitrage opportunities around the pricing for cache reads based on how many cache writes you’ve pre-purchased.

    The question here is how long Claude holds your prompt in cached memory: You can pay for a 5-minute window, or pay more for an hour-long window. It’s much cheaper to draw on data that’s still in the cache, so if you manage it right, you can save an awful lot. There is a catch though: Every new bit of data you add to the query may bump something else out of the cache window.
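To make the trade-off concrete, here is a minimal sketch of the arithmetic in Python. The base price, the write and read multipliers, and the function name `session_cost` are all hypothetical placeholders chosen for illustration, not figures quoted from any provider's pricing page. The sketch also assumes that every cache hit refreshes the window, and it models only the time window, not data being bumped out of the cache by new content.

```python
def session_cost(prefix_tokens, n_requests, gap_s, ttl_s,
                 base_per_mtok, write_mult, read_mult):
    """Dollar cost of sending one shared prefix n_requests times, gap_s seconds apart.

    All prices and multipliers are caller-supplied assumptions.
    """
    if n_requests <= 0:
        return 0.0
    base = prefix_tokens / 1_000_000 * base_per_mtok
    write_cost = base * write_mult   # populating the cache costs more than a plain send
    read_cost = base * read_mult     # reading from a warm cache costs far less
    total = write_cost               # the first request always writes the cache
    for _ in range(n_requests - 1):
        # Assumption: a hit refreshes the window, so only the gap between
        # consecutive requests decides whether the cache is still warm.
        total += read_cost if gap_s <= ttl_s else write_cost
    return total


# Hypothetical scenario: a 100,000-token prefix sent 10 times at $3 per million tokens.
PREFIX, BASE, N = 100_000, 3.00, 10
uncached = PREFIX / 1_000_000 * BASE * N
short_ttl_fast = session_cost(PREFIX, N, 120, 300, BASE, 1.25, 0.10)
short_ttl_slow = session_cost(PREFIX, N, 600, 300, BASE, 1.25, 0.10)
long_ttl_slow = session_cost(PREFIX, N, 600, 3600, BASE, 2.00, 0.10)

print(f"no caching:             ${uncached:.3f}")
print(f"5-min window, 2-min gap:  ${short_ttl_fast:.3f}")
print(f"5-min window, 10-min gap: ${short_ttl_slow:.3f}")
print(f"1-hour window, 10-min gap: ${long_ttl_slow:.3f}")
```

Under these assumed numbers, a short window pays off only when requests arrive faster than the window expires. Once the gap exceeds the window, every request pays the write premium and caching costs more than not caching at all, which is where the pricier long window starts to make sense.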

    This is complex stuff, but the upshot is simple enough: Managing memory for AI models is going to be a huge part of the field going forward. Companies that do it well are going to rise to the top.

    And there is plenty of progress to be made in this new field. Back in October, I covered a startup called Tensormesh that was working on one layer in the stack known as cache optimization.


    Opportunities exist in other parts of the stack. For instance, lower down the stack, there’s the question of how data centers are using the different types of memory they have. (The interview includes a nice discussion of when DRAM chips are used instead of HBM, although it’s pretty deep in the hardware weeds.) Higher up the stack, end users are figuring out how to structure their model swarms to take advantage of the shared cache.
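As a rough illustration of why the structure of a swarm's prompts matters for a shared cache, the sketch below models the cache as a simple prefix tree in Python. It is a toy under stated assumptions: the `PrefixCache` class is hypothetical, whitespace-split words stand in for real tokens, and the cache never expires or evicts anything. Production systems cache internal model state in blocks rather than individual words.

```python
class PrefixCache:
    """Toy shared cache: a request reuses only its longest already-seen prefix."""

    def __init__(self):
        self.root = {}

    def process(self, tokens):
        """Cache the request and return how many tokens needed fresh processing."""
        node = self.root
        hits = 0
        for tok in tokens:
            if tok in node:
                hits += 1          # still walking a prefix some agent already sent
            else:
                node[tok] = {}     # first divergence: everything after is new
            node = node[tok]
        return len(tokens) - hits


# Hypothetical swarm: three agents share instructions but have different tasks.
shared = "You are a helpful research agent . Use the shared notes .".split()
tasks = [["summarize", "report"], ["check", "citations"], ["draft", "reply"]]

# Shared instructions first, task last: later agents reuse the common prefix.
cache = PrefixCache()
prefix_first = sum(cache.process(shared + task) for task in tasks)

# Task first, shared instructions last: requests diverge at the first token.
cache = PrefixCache()
task_first = sum(cache.process(task + shared) for task in tasks)

print("fresh tokens, shared text first:", prefix_first)
print("fresh tokens, task text first:  ", task_first)
```

In this toy model, putting the shared material at the front of every request lets the second and third agents pay only for their own task tokens, while putting it at the end means nothing is reused.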

    As companies get better at memory orchestration, they’ll use fewer tokens and inference will get cheaper. Meanwhile, models are getting more efficient at processing each token, pushing the cost down still further. As server costs drop, a lot of applications that don’t seem viable now will start to edge into profitability.


