Everyone from tech companies to AI influencers is foaming at the mouth, attempting to get you to mainline AI into every aspect of your personal life. You are told you should outsource important decisions and allow these systems to rummage through all of your highly personal data so you can improve your life. Whatever that means. With the continued push of today’s AI technology even deeper into the systems we use daily, there will inevitably be a data-hungry push to personalize this experience. In other words, to use your highly personal, sensitive data for whatever ends a third-party company would like.
Although we may have a gut reaction that all of this doesn’t feel right and may be dangerous, we don’t have a good way of framing a conversation about the safety of these tools. The ultimate question many may have is, are these tools safe to use?
The answer to this question comes from analyzing both the technical and the human aspects. In this post, I’ll address the technical aspects of this question by introducing SPAR, a way of evaluating the technical safety attributes, and discuss what it takes to achieve a safe baseline.
Personal AI Assistants
Personal AI assistants are the next generation of AI-powered digital assistants, highly customized to individual users. Think of a more connected, omnipresent, and capable version of Siri or Alexa. These tools will be powered by multimodal large language models (LLMs).
People will most likely use the term Personal AI (yuck) for this in the future. I think this is for two reasons. First, AI influencers will think it sounds cooler. Second, people don’t like to think they need assistance.
Personalization
Personalization makes technology more sticky and relevant to users, but the downside is that it also makes individual users more vulnerable. For personal AI assistants this means granting greater access to data and activities about our daily lives. This includes various areas such as health, preferences, and social activities. Troves of data specific to you will be mined, monetized, and potentially weaponized (overtly or inadvertently) against you. Since this system knows so much about you, it can nudge you in various directions. Is the decision you are about to make truly your decision? This will be an interesting question to ponder in the coming years.
Safe To Use?
Answering whether a personal AI assistant is safe to use involves looking at two sets of risks: technical and human. You can’t evaluate the human risks until you’ve addressed the technical ones. This should be obvious because technical failings can cause human failings.
On the other hand, this isn’t about striving for perfection either. Just like drugs have acceptable side effects, these systems have side effects as well. Ultimately, evaluating the side effects vs the benefits will be an ongoing topic. If a technical problem with a drug formula causes an excess mortality rate, you can’t begin to address its effectiveness in treating headaches.
SPAR – Technical Safety Attributes
Let’s take a look at whether, from a technical perspective, an assistant is safe to use. Before introducing the categories, it needs to be said that the system as a whole needs to exhibit these attributes. Assistants won’t be a single thing but an interwoven connection of data sources, agents, and API calls, working together to give the appearance of being a single thing.
For simplicity’s sake, we can define the technical safety attributes in an acronym, SPAR. This acronym stands for Secure, Private, Aligned, and Reliable. I like the term SPAR because humans will spar not only with the assistant but also with the company creating it.
There is no such thing as complete attainment in any of these attributes. For example, there is no such thing as a completely secure system, especially as complexity grows. Still, we do have a sense of when something is secure enough for the use case, and the product maker has processes in place to address security in an ongoing manner. Each of these categories needs to be treated the same way.
Secure
Although this category should be relatively self-explanatory, in simple terms, the system is resistant to purposeful attack and manipulation. These assistants will have far more access to sensitive information about us and connections to accounts we own. The assistant may act on our behalf since we delegate this control to the assistant. Having this level of access means there needs to be a purposeful effort built into the assistant to protect the users from attacks.
Typically, when users have an account compromised, it is seen as more of an annoyance to the user. They may have to change their password or take other steps, but ultimately, the impact is low for many. With the elevated capability of these assistants, there is an immediate and high impact on the user.
Private
Simply put, a system that doesn’t respect the privacy of its users cannot be trusted. It is almost certain that your hyper-personalized AI assistant won’t be a hyper-personalized private AI assistant. Perverse incentives are at the core of much of the tech people use daily, and data is gold. In fact, it seems the only people who don’t value our data are us.
Imagine if you had a parrot on your shoulder that knew everything about you, and whenever anyone asked, it just blurted out what it had learned. Now, imagine if that parrot had the same access as you have to all your accounts, data, and activities. This isn’t far off from where we are headed.
Your right not to incriminate yourself won’t extend to your assistant, so it could be that law enforcement interrogates your assistant instead of you. Since your assistant knows so much about you and your activities, it happily coughs up not only what it knows but also what it thinks it knows. Logs, interactions, and conversations could be collected and used against you. Even things that may not be true but are inferred by the system can also be used against you.
Aligned
AI alignment is a massive topic, but we don’t need a deep dive here. What we mean by alignment in hyper-personalized assistants is that they take actions that align with your goals and interests. The “your” here refers to you, the user, not the company developing the assistant. So many of the applications and tools we use daily aren’t serving our best interests but the interests of the company making them. Personal AI assistants, however, will have to serve the user’s interests. Too much is at stake.
These tools will take action and make recommendations on your behalf. In a way, they are acting as you. You need to know that actions taken or even nudges imposed upon you are in your best interest and align with your wishes, not any outside entity’s wishes. Given the complete lack of visibility in these systems, this will be hard to determine, even in the best of cases.
Reliable
A system that isn’t reliable isn’t safe to use. It’s almost as simple as that. If the brakes in your car only worked 90% of the time, we would assume they were faulty, even though 90% seems to be a relatively high percentage.
The problem here is that other factors can often mask issues with reliability. For example, if we get bad data and never verify the accuracy, we won’t know that the system is unreliable. Quite often, in our fast-moving, attention-poor environments, we don’t know when our information is unreliable.
Additional Notes on SPAR Attributes
SPAR attributes aren’t simply features that can be attained and assumed to maintain their status in perpetuity. These features must be consistently re-evaluated as the system matures, updates, and adds new functionality. You can see this in social media. Back in 2007 and 2008, when I was researching social media platforms, the dangers were mostly issues with the technology. However, if you look at the dangers of social media today, the technology is fairly robust, and the dangers we encounter are human ones.
Of course, startups can also be acquired, opening new dangers to people’s information and actions taken. The startup with a strong data privacy or alignment stance can become a big tech company that doesn’t respect your privacy and emphasizes its own goals.
It’s important to realize that none of these categories have been attained to an acceptable level today despite the constant hype surrounding the technology. There is no doubt that today’s technology, with all of its flaws, will be repackaged and marketed as Tomorrow’s Tools.
SPAR Attainment
Once a system has SPAR attainment, which means it properly addresses SPAR attributes, then we can consider the technology to have an acceptably safe baseline. That certainly doesn’t answer our question about whether the technology is safe to use, but what it does do is give us a safe baseline to further evaluate the potential human dangers and impacts.
Conclusion
I hope this post provides a useful starting point for discussing personal AI safety, which is about to become a massively important topic. As AI gets more personal, we must evaluate potential tradeoffs and set boundaries. We can’t do this until the technical safety attributes are accounted for.
To add to the complication, the speed at which these tools are created and the lack of configuration options make that nearly impossible. Unfortunately, things will remain in this state for quite some time. Still, if organizations address SPAR attributes, it becomes much easier to establish a safe baseline from which to explore safety further.
Historically, attackers have targeted large, centralized systems that only represent a small amount of an individual user’s data. This is high value for attackers, but it has a low impact on individual users. This will morph in the coming years. Hyppönen’s Law needs an update in the AI era because in a world of highly personalized AI, if it’s smart, you’re vulnerable.
If we are not careful, we are about to enter an era of software development, where we replace known, reliable methods with less reliable probabilistic ones. Where methods such as prompting a model, even with context, can still lead to fragility causing unexpected and unreliable outputs. Where lack of visibility means you never really know why you receive the results you receive, and making requests over and over again becomes the norm. If we continue down this path, we are headed into a brave new world of degraded performance.
Scope
Before we begin, let’s set the perspective for this post. The generative AI I’m covering in this post is related to Large Language Models (LLMs) and not other types of generative AI. This post focuses on building software meant to be consumed by others: products and applications deployed throughout an organization or delivered to customers. I’m not referring to experiments, one-off tools, or prototypes. That said, buggy prototype code has an odd habit of showing up in production because a function or feature just worked.
This post isn’t about AI destroying the world or people dying. It’s about the regular applications we use, even in a mundane context, just not being as good. The cost of failure doesn’t have to be high for the points in this post to apply. I’m saying this because, in many cases, the cost may be low. People probably won’t die if your ad-laden personalized horoscope application fails occasionally. But that doesn’t mean users won’t notice, and there won’t be impacts.
Our modern world runs on software, and we are training people that buggy software should be expected, and making requests repeatedly is the norm, setting the expectation that this is just the price paid in modern software development. This approach is bad, and the velocity at all costs mantra is misguided.
Let me be clear because I’m sure this will come up. I’m not anti-AI or anti-LLM or anything of the sort. These tools have their uses and can be incredibly beneficial in certain use cases. There are also some promising areas, such as the ability of LLMs to generate, read, and understand code and what that means for software development in the coming years. It’s still early. So in no way am I claiming that LLMs are useless. I’m trying to address the hype, staying in the realm of reality and not fantasy. The truth today is that maximizing these tools for functionality instead of being choosy is the problem, and there are costs associated.
Software Development
Software development has never been perfect. It’s always been peppered with foot guns and other gotchas, be it performance or security issues, but what it lacked in elegance, it made up for in visibility and predictability. Developers had a level of proficiency with the code they wrote and an understanding of how the various components worked together to create a cohesive service, but this is changing.
Now, you can make a bunch of requests to a large language model and let it figure it out for you. No need to write the logic, perform data transformations, or format the output. You can have a conversation with your application before having it do something and assume the application understands when it gives you the output. What a time to be alive!
There’s no doubt that tools like ChatGPT increased accessibility to people who’ve never written code before. Mountains of people are creating content showing, “Look, Mom, I wrote some code,” bragging that they didn’t know what they were doing. I’ve seen videos of University Professors making the same claims. This has and will continue to lead to many misunderstandings about problems people are trying to solve and the data they are trying to analyze. Lack of domain expertise and lack of functional knowledge about how systems work is a major problem but not the focus of this post.
As a security professional, I cringe when inexperienced people spread buggy code (look at the Web3 space for examples), but it’s not all bad. In some ways, this accessibility is a benefit and may lead to people discovering new careers and gaining new opportunities. Also, small experiments, exploration, or playing around with the tools are absolutely fine. It’s how you discover new things. Inefficiencies, errors, and lack of reliability aren’t dealbreakers in these cases. But what happens when this mindset is taken to heart and industrialized into applications and products that impact business processes and customers?
Degraded Performance
There’s a new approach in town. You no longer have to collect data, ensure it’s labeled properly, train a model, perform evaluations, and repeat. Now, in hours, you can throw both apps and caution to the wind as you deploy into production!
The above is a process outlined by Andrew Ng in his newsletter and parroted by countless content creators and AI hustle bros. It’s the kind of message you’d expect to resonate. I mean, who wouldn’t like to save months with the added benefit of removing a whole mountain of effort in the process? But, as with crypto bros and their Lambos, if it sounds too good to be true, it probably is.
Let’s look at a few facts. Compared to more traditional approaches:
LLMs are slow
LLMs are inefficient
LLMs are expensive ($)
LLMs have reliability issues
LLMs are finicky
LLMs can and do change (Instability)
LLMs lack visibility
Benchmarking? Measuring performance?
Pump the Brakes
Traditional machine learning approaches can have much better visibility into the entire end-to-end process. This visibility can even include how a decision or prediction was made. They can also be better approaches for specific problems in particular domains. These approaches also make it far easier to benchmark, create ensembles, perform cross-validation, and measure performance and accuracy. Everyone hates data wrangling, but you learn something about your data, given all that wrangling. This familiarity helps you identify when things aren’t right. Having visibility into the entire process means you can also identify potential issues like target leakage or when a model might give you the right answer but for the wrong reasons, helping avoid a catastrophe down the road.
The friction in more traditional machine learning is a feature, not a bug, making it much easier to spot potential issues and create more reliable systems.
Lazy Engineering
On the surface, letting an LLM figure everything out may seem easier. After all, Andrew Ng claims something similar. In his first course on Deeplearning.ai, ChatGPT Prompt Engineering for Developers, he mentions using LLMs to format your data as well as using triple backticks to avoid prompt injection attacks. Even the popular LangChain library instructs the LLM to format data in the same way. Countless others are creating similar tutorials, flooding the web and parroting this point. Andrew is a highly influential person who’s helped countless people with this training by making machine learning more accessible. With so many people telling others what they want to hear, as well as the accessibility of tools like LangChain, this will have an impact, and it’s not all positive.
One of the goals of software engineering should be to minimize the number of potential issues and unexpected behaviors an application exhibits when deployed in a production environment. Treating LLMs as some sort of all-capable oracle is a good way to get into trouble. This is for two primary reasons: lack of visibility and lack of reliability.
Black Boxes
A big criticism of deep learning approaches has been their lack of transparency and visibility. Many tools have been developed to try and add some visibility to these approaches, but when maximized in an application, LLMs are a step backward. A major step backward if you count things like OpenAI’s Code Interpreter.
The more of your application’s functionality you outsource to an LLM, the less visibility you have into the process. This can make tracking down issues in your applications when they occur almost impossible. And when you can track problems down, assuming you can fix them, there will be no guarantee that they stay fixed. Squashing bugs in LLM-powered applications isn’t as simple as patching some buggy code.
Right, Probably
LLMs are being touted as a way to take on more and more functionality in the software being built, giving them an outsized role in an application’s architecture. Any time you replace a more reliable deterministic method with a probabilistic one, you may get the right answer much of the time, but there’s no guarantee you will. This means you could have intermittent failures that impact your application. In more extreme cases, these failures can cascade through a system affecting the functionality of other downstream components.
For example, anyone who has ever asked an LLM to return a single-word result will know that sometimes it doesn’t, and there’s no rhyme or reason why. It’s one of the classic blunders of LLMs.
So, you may construct a prompt stating only to return a single word, True or False, based on some request. Occasionally, without warning and even with the temperature set to 0, it will return something like the following:
The result is True
Not the end of the world, but now translate this seemingly insignificant quirk into something more impactful. Your application expected a result from an LLM formatted in a certain way. Let’s say you wanted the result formatted in JSON. Now, your application receives a result that isn’t JSON or maybe not properly formatted JSON, creating an unexpected condition in your application.
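One way to keep this quirk from becoming an unexpected condition is to treat every LLM reply as untrusted input and validate it before anything downstream sees it. The following is a minimal sketch, not a definitive implementation. The function names and the idea of a set of required keys are my own assumptions for illustration; only the standard library `json` module is used.

```python
import json
from typing import Any, Dict, Optional, Set


def parse_bool_reply(reply: str) -> Optional[bool]:
    """Accept only an exact 'True' or 'False' reply; reject anything else."""
    cleaned = reply.strip().lower()
    if cleaned == "true":
        return True
    if cleaned == "false":
        return False
    # A reply such as "The result is True" ends up here.
    return None


def parse_json_reply(reply: str, required_keys: Set[str]) -> Optional[Dict[str, Any]]:
    """Return the decoded object only if it is a JSON object with the expected keys."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None  # not JSON at all, or malformed JSON
    if not isinstance(data, dict) or not required_keys.issubset(data):
        return None  # valid JSON, but not the shape the application expects
    return data
```

A `None` result gives the application an explicit place to retry, log, or fall back. It doesn’t make the model any more reliable. It only makes the failure visible at the boundary instead of somewhere downstream.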
Suppose we combine this reliability issue with the lack of visibility. In that case, it can lead to some serious issues that may be intermittent, hard to troubleshoot, and almost impossible to fix without reengineering. In a more complex example, maybe you’ve sent a bunch of data to an LLM and asked it to perform a series of actions, some including math or counting, and return a result in a particular format. A whole mess of potential problems could result from this, all of which are outside your control and visibility.
Not to mention a big point many gloss over, deploying your application in production isn’t the end of your development journey. It may be the beginning. This means you will need to perform maintenance, troubleshooting, and improvements over time. All things LLMs can make much more difficult when functionality is maximized.
To summarize, outsourcing more and more application functionality to an LLM means that your application becomes less modular and more prone to unexpected errors and failures. These are issues that Matthew Honnibal also covers in his great article titled Against LLM Maximalism.
The Slow and Inefficient Slide
In some use cases, it may not matter if it takes seconds to return a result, but for many, this is unacceptable. Multiple round trips and sending the same data back and forth may be necessary for different reasons, such as a different use case, a single changed character, or context window size, all of which add to the inefficiency. Even if the use case isn’t critical and inefficiencies can be tolerated, that’s not the end of the story.
There are still environmental impacts due to this inefficiency. Having an LLM perform a task consumes much more energy than more traditional methods. Compare searching for a condition with a RegEx to sending large chunks of data to an LLM and letting the LLM try to figure it out. The people ranting and raving constantly about the environmental impacts of PoW cryptocurrency mining are incredibly silent on the energy consumption of AI, even as former crypto miners turn their rigs toward AI. Think about that next time you want to replace a method like grep with ChatGPT or generate a continuous stream of cat photos with pizzas on their heads.
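To make the RegEx side of that comparison concrete, here is a small sketch of searching text for a condition locally. The condition itself (lines containing the words ERROR or FATAL) and the function name are hypothetical, chosen only for illustration.

```python
import re
from typing import Iterable, List

# Hypothetical condition: a line mentions ERROR or FATAL as a whole word.
ERROR_LINE = re.compile(r"\b(ERROR|FATAL)\b")


def find_error_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that match the condition, in their original order."""
    return [line for line in lines if ERROR_LINE.search(line)]
```

Nothing leaves the machine, the same input gives the same output every time, and when a match is wrong you can read the pattern to see why.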
LLMs Change and So Do You
Any check of social media will show that at the time of this writing, there have been quite a few people claiming that GPT-4 is getting worse. There’s also a paper that explores this.
There’s some debate over the paper and some of the tests chosen, but for the context we are discussing in this post, the reason why an LLM might change isn’t relevant. Whether changes are because of cost savings, issues with fine-tuning, upgrades, or some other factor doesn’t matter when you count on these technologies inside your application. This means your application’s performance can worsen for the same problems, and if you are consuming a provider’s model (OpenAI, Google, Microsoft, etc.), there isn’t much you can do about it but hope. This can also lead to instability when the provider requires an upgrade to a newer version of the hosted model, which may degrade performance in your application.
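You can’t stop a hosted model from changing, but you can at least notice when it does. One possible approach, sketched here under stated assumptions, is to keep a fixed set of prompts with the exact replies your application depends on and re-run them whenever the provider updates the model. The cases and the two stand-in model functions are hypothetical. A real check would call your provider’s client in place of them.

```python
from typing import Callable, List, Tuple

# Hypothetical regression cases: (prompt, exact reply the application depends on).
CASES: List[Tuple[str, str]] = [
    ("Is 7 a prime number? Answer True or False.", "True"),
    ("Is 8 a prime number? Answer True or False.", "False"),
]


def pass_rate(model: Callable[[str], str], cases: List[Tuple[str, str]]) -> float:
    """Fraction of cases where the model's stripped reply equals the expected text."""
    if not cases:
        raise ValueError("at least one case is required")
    passed = sum(1 for prompt, expected in cases if model(prompt).strip() == expected)
    return passed / len(cases)


# Stand-ins for two versions of a hosted model (assumptions, not real APIs).
def old_model(prompt: str) -> str:
    return "True" if "7" in prompt else "False"


def new_model(prompt: str) -> str:
    # Same underlying answers, but one reply now carries extra wording.
    return "The result is True" if "7" in prompt else "False"
```

Here `pass_rate(old_model, CASES)` is 1.0 and `pass_rate(new_model, CASES)` is 0.5. A drop like that tells you something changed. It won’t tell you why, which is the point of this section.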
Demo Extrapolation
The problem is that none of the constraints and issues may surface for demos and cherry-picked examples. Actually, the results can look positive. Positive results in demos are a danger in and of themselves since this apparent working can mask larger issues in real-world scenarios. The world is filled with edge cases, and you may be running up a whole bunch of technical debt.
Hypetimism and Sunk Cost
There’s a sense that technology and approaches always get better. Maybe this comes from sci-fi movies, maybe from people getting a new iPhone every year, or maybe from a combination of both. But approaches can be highly problem- or domain-specific and not generalize to other problem areas, or at least not generalize well. We don’t have an all-powerful single AI approach to everything. Almost nobody today would allow an LLM to drive their car. However, some have hooked them up to their bank accounts. Yikes!
But you can detect an underlying sense of “give it time” in people’s discussions on this topic. Whenever you point out issues, you usually get, “Well, GPT-5 is gonna…” It should go without saying that ChatGPT is based on a large language model, and large language models are trained on what people write, which in certain cases isn’t even what they actually think. They perform best on generative tasks. On the other hand, tasks like operating a car have nothing to do with language. Sure, you could tell the car a destination, but every other operation has nothing to do with language. It’s true that LLMs can also generate code, but do you want your car to generate and compile code while driving it? Let me answer that. Hell no. Heed my words, maybe not this use case, but something in the same order of stupid is coming.
Developing buggy software in the hopes that improvements are on the way and outside your control is not a great strategy for reliable software development. I’ve heard multiple stories from dev teams that they continue to run buggy code with LLM functionality and make excuses for apparent failures because of sunk costs.
The hype has led to a new form of software development that appears to be more like casting a spell than developing software. The AI hustle bros want you to believe everything is so simple and money is just around the corner.
Now’s a good time to remind everyone that fantasy sells far better than reality. Lord of the Rings will always sell more books than one titled Eat Your Vegetables. Trust me, as most of my posts are along the lines of Eat Your Vegetables posts, I make no illusions that every AI hustler’s Substack making nonsensical and unfounded predictions is absolutely crushing me in page views.
Engineering Amnesia
In a development context, we may forget that better methods exist or allow ourselves to reintroduce known issues that cause cascading failures and catastrophic impacts on our applications. This isn’t without precedent.
The LAND attack came back in Windows XP after it was known and already mitigated in previous Windows OSs. ChatGPT plugins are allowed to execute in the context of each other’s current domains, even though we’ve seen time and time again how this violates security. The Corrupted Blood episode was a failure to understand how the containment of a feature could cause catastrophic damage to an application, so much so that it forced a reset. And, of course, don’t even get me started on the Web3 space. I mean, who wouldn’t want tons of newly minted developers creating high-risk financial products without knowledge of known security issues? It was fascinating to see security issues in high-impact products for which standard, boring, and known security controls would have prevented them. These are just a couple off the top of my head, and there are many more.
As new developers learn to use LLMs to perform common tasks for which we have better, more reliable methods, they may never become aware of these methods because their method just kind of works.
Avoiding Issues
The perplexing part of all of this is that these issues are pretty easy to avoid, mainly by thinking carefully about your application’s architecture and the features and components you are building. Let me also state that these issues won’t be solved by writing better prompts.
Reliability and visibility issues won’t be solved by writing better prompts
There’s the perception that using an LLM to figure everything out is easier than other methods. On the surface, it may appear that there’s some truth to that. It’s also easier to spend money on a credit card than to make the money to pay the bill. So, it’s the case that you may be kicking the can down the road. Avoiding these issues isn’t hard, and a bit of thought about your application and its features will go a long way.
Look at your application’s features. Break these features down into functional modules. The goal of breaking down these features into smaller components is to evaluate the intended functionality to determine the best approach for the given feature. At a high level, you could ask a few questions with the goal of determining the right tool for the processing task.
Does the function require a generative approach?
Are there existing, more reliable methods to solve the problem?
How was the problem solved before generative AI? (Potential focusing question if necessary)
Is there a specific right or wrong answer to the problem?
What happens if the component fails?
These questions are far from all-encompassing, but they are meant to be simple and provide some focus on individual component functionality and the use case. After all, LLMs are a form of generative AI, and therefore, they are best suited to generative tasks. Asking if there’s a specific right or wrong answer is meant to focus on the output of the function and consider if a supervised learning approach may be a better fit for the problem.
We have reliable ways of formatting data, so it’s perplexing to see people using LLMs to perform data formatting and transformations, especially since you’ll have to perform those transformations every time you call the LLM. Asking these questions can help avoid issues where improperly formatted data can cause a cascading issue.
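As a small illustration of what a reliable way of formatting data looks like, here is a sketch that converts CSV text to JSON with nothing but the standard library. The function name and the sample columns are hypothetical.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)


# Example input with hypothetical columns.
print(csv_to_json("name,score\nAda,3\nGrace,5\n"))
```

The output is always valid JSON, the transformation costs next to nothing per call, and if a row comes out wrong, you can step through exactly why.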
Example
Let’s take a simple example. You want a system that parses a stream of text content looking for mentions of your company. If your company is mentioned, you want to evaluate the sentiment around the mention of your company. Based on that sentiment, you’d like to write some text addressing the comment and post that back to the system. We can break this down into three tasks: parsing, analysis, and text generation.
For parsing, analysis, and text generation steps, it would be tempting to collapse all of them together and send them to an LLM for processing and output. This would be maximizing the LLM functionality in your application. You could technically construct a prompt with context to try and perform these three activities in a single shot. That would look like the following example.
In this case, you have multiple points of failure that could easily be avoided. You’d also be sending a lot of potentially unnecessary data to the LLM in the parsing stage since all data, regardless of whether the company was mentioned, would be sent to the LLM. This can substantially increase costs and increase network traffic, assuming this was a hosted LLM.
You are also counting on the LLM to properly parse the given content, then properly analyze it, and then, based on the two previous steps, properly generate the output. All of these functions happen outside of your visibility, and when failures happen, they can be impossible to troubleshoot.
So, let’s apply the questions mentioned in the post to this functionality.
Parsing
Does the function require a generative approach? No
Are there existing, more reliable methods to solve the problem? Yes, more traditional NLP tools or even simple search features
Is there a specific right or wrong answer to the problem? Yes, we want to know for sure that our company is mentioned.
What happens if the component fails? In the current LLM use case, the failure feeds into the following components outside the visibility of the developer, and there’s no way to troubleshoot this condition reliably.
Analysis
Does the function require a generative approach? No
Are there existing, more reliable methods to solve the problem? Yes, more traditional and mature NLP tasks for sentiment analysis
Is there a specific right or wrong answer to the problem? Yes
What happens if the component fails? In the current LLM use case, the failure feeds into the following text generation component outside the developer’s visibility, and there’s no way to troubleshoot this condition reliably.
Text Generation
Does the function require a generative approach? Yes
Are there existing, more reliable methods to solve the problem? LLMs appear to be the best solution for this functionality.
Is there a specific right or wrong answer to the problem? No, since many different texts could satisfy the problem
What happens if the component fails? We get text output that we don’t like. However, since the previous steps happen beyond the developer’s visibility, there’s no way to troubleshoot failures reliably.
Revised Example
After asking a few simple questions, we ended up with a revised use case. This one uses the LLM functionality for the problem it’s best suited for.
In this use case, only the text generation phase uses an LLM. Only confirmed mentions of the company, along with the sentiment and the content necessary to write the comment, are sent to the LLM. Much less data flows to the LLM, lowering cost and overhead. By using more robust methods, much less can go wrong as well, and cascading failures affecting downstream functions are less likely. When something does go wrong in the parsing or analysis stages, troubleshooting is much easier since you have more visibility into those functions. So, breaking down this functionality in such a way means that failures can be more easily isolated and addressed, and you can improve more reliably as the application matures.
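A rough sketch of that split might look like the following. Everything specific in it is an assumption made for illustration: the company name “Acme Corp”, the toy word lists standing in for a real sentiment model, the prompt wording, and the `generate` callable, which is whatever function wraps your LLM of choice.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List

COMPANY = re.compile(r"\bAcme Corp\b", re.IGNORECASE)  # hypothetical company name

# Toy word lists standing in for a mature sentiment analysis tool (assumption).
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"terrible", "hate", "broken"}


@dataclass
class Mention:
    text: str
    sentiment: str


def mentions_company(text: str) -> bool:
    """Parsing step: a plain pattern search, no LLM involved."""
    return COMPANY.search(text) is not None


def score_sentiment(text: str) -> str:
    """Analysis step: count positive and negative words, no LLM involved."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def build_prompt(mention: Mention) -> str:
    """Only confirmed mentions, with their sentiment, reach the prompt."""
    return (
        f"Write a short, polite reply to this {mention.sentiment} comment "
        f"about Acme Corp:\n{mention.text}"
    )


def respond(stream: Iterable[str], generate: Callable[[str], str]) -> List[str]:
    """Run the pipeline; `generate` is the only place an LLM is called."""
    replies = []
    for text in stream:
        if not mentions_company(text):
            continue  # this content is never sent to the LLM
        mention = Mention(text, score_sentiment(text))
        replies.append(generate(build_prompt(mention)))
    return replies
```

Because `generate` is passed in, the parsing and analysis stages can be tested with a fake function and no network access, and a failure in either stage shows up in code you can read and debug.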
Now, I’m not claiming that this is a development utopia. A lot can still go wrong, but it’s a far more consistent and reliable approach than the previous example.
After talking with developers about this, some of the questions I’ve received are along the lines of, “There are better methods for my task, so if we can’t cut corners, then why use an LLM at all?” Yes, that’s a good question, a very good question, and maybe you should reevaluate your choices. This is my surprised robot face when I hear that.
LLMs Aren’t Useless
Once again, I’m not saying that LLMs are useless or that you shouldn’t use them. LLMs fit specific use cases and classes of functionality that applications can take advantage of. For many tasks, there’s the right tool for the job or at least a righter tool for the job. However, this right tool for the right job approach isn’t what’s being proposed in countless online forums and tutorials. I’m concerned with a growing movement of using LLMs as some general-purpose application functionality for tasks that we already have much more reliable ways of performing.
Conclusion
Will we inhabit a sprawling landscape of digital decay where everything rests on crumbling foundations? Probably not. But there will be a noticeable shift in the applications we use on a daily basis. It doesn’t have to be that way. By being choosy and analyzing functionality where LLMs are best suited, you can make more reliable and robust applications, and the environment will also thank you.