You might wonder why AI companies are working on seemingly simple and unimportant advancements in AI when there are much more significant problems to solve. Why would companies trying to create AGI get sidetracked by focusing on potentially already-solved problems? A couple of examples are OpenAI’s voice cloning, Google’s VLOGGER, and Microsoft’s VASA-1.This research, for many, only seems to have use cases for fakes and frauds, but I believe this work signals something much deeper: that we could be near the peak of LLM capabilities. With AGI off the table, it is time to go deep and get very personal.
Peak LLM
Although you can do some cool things with LLMs, and we’ll no doubt see further applicability in other use cases, it’s a far cry from their touted value. You know what I’m talking about, the more impactful than the printing press crowd that still seems to swarm every conversation on the topic. These people talk about 10x, 100x, and even 1000x productivity boosts with LLMs. Compared to bold AGI claims and nonsense productivity levels, a 10% efficiency gain seems inconsequential.
The Wall Street Journal reported that the AI industry spent $50 billion on the Nvidia chips used to train advanced AI models last year but brought in only $3 billion in revenue. Ouch! There is reporting on the dismal outlook for generative AI, and some foresee a new Dotcom crash.
People have become more skeptical of claims (as they should), and it seems that many more people are noticing. You can’t believe the demos you see. Many are highly controlled or manufactured altogether. Even the SORA demo that everyone lost their minds over wasn’t what it purported to be.
LLMs are under-delivering on their overhyped promises.
I don’t know what to think about the economic angle. It’s not my area of expertise. I just now know that LLMs are under-delivering on their overhyped promises. Where leads economically, I don’t know.
Many LLMs, including open-source models like Llama 3, are catching up to GPT-4. Even if they don’t have the exact level of performance, they are close, which should tell us something. We may be hitting peak LLM capabilities. This means GPT-5 won’t be AGI or exponentially better than GPT-4. GPT-5 may be better than GPT-4 in some ways, but it is far from a groundbreaking explosion of capabilities.
This lack of performance isn’t going unnoticed at the companies building the technology either. This is why a new approach is needed by companies looking to monetize AI investments further. There’s about to be a shift away from a focus on AGI (although they’ll still talk about it) and ever more capable models to you. That’s right, you.
You’re Next
Just because we may be hitting peak LLM capabilities doesn’t mean things will stop. When you’ve reached the limit of going wide (general), you go deep (personal). This will be a sleight of hand shifting from purely training larger models on more data, creating more capabilities in a broad sense, to deeper, more personal integration.
These companies will make it all about you, not because you are the most important aspect, but because you are where the data is. With systems that are closer to you and more integrated with your data and activities, these companies are hoping to make the products more sticky, with the beneficial exhaust of having access to all your data.
The hope is that an epiphany will sprout from your screen as you find the same tools you previously could take or leave now indispensable. Or maybe even fool yourself with the tech, as the public launch of ChatGPT showed. ChatGPT became a social contagion not because people found it so indispensable but because we are bad at constructing tests and good at filling in the blanks.
But don’t take my word for it. Sam Altman has already started pivoting in this direction. Here’s what he says about the goal of AI: “A super-competent colleague that knows absolutely everything about my whole life, every email, every conversation I’ve ever had, but doesn’t feel like an extension.” That’s pretty creepy. But there’s more.
You can make the tech more sticky by allowing people to personalize and customize in more advanced ways. Technology like voice cloning and animating faces supports this customization aspect. When you can choose whoever you want to be your assistant’s AI avatar, you can anthropomorphize it more. How would you feel if a random stranger used your face and voice as their personal assistant? What about a family member? Is this creepier still? Oddly enough, it serves no purpose for the individual user. It doesn’t make the tool any smarter or more capable. It only exists to manipulate us or allow us to manipulate ourselves.
In the end, you’ll be blamed for LLMs’ lack of success by not allowing them to plunge deeply enough into your life. There’s a saying that if you don’t pay for something, then you are the product. Well, in the age of generative AI, you can pay for something and still be the product. The future’s so bright 😎
Even Deeper
AI companies are doing their best to make this technology unavoidable. We are getting AI whether we want it or not. It’s being baked into the very foundations of our computing systems, and even your humble mouse hasn’t escaped this integration.
How you deactivate these integrations will be anyone’s guess, as the flood of new integrations infects every application imaginable. A security check will be due soon, but security issues aren’t the only problem. As I’ve said, we are creating a brave new world of degraded performance. In an attempt to make hard things easier, we may make easy things hard.
Applications of narrow AI are cool and can be incredibly useful for certain tasks, but does it warrant hooking everything up to LLMs and hoping for the best? I don’t think so, and this approach is fairly misguided, opening us up to unnecessary risks.
Conclusion
We must be much more selective before blindly accepting deep data access and personal integration for these tools. This can start with a few relatively simple questions. What do we hope to gain from this access? How will this provide a measurable benefit? And, most importantly, are the trade-offs worth it? The answers to these questions will be different for everyone.
In many cases, it appears that for the small price of your soul, you can appear and sometimes feel marginally better in some aspects but be measurably worse in others. Does that sound like a good trade?
So, let’s talk about posthumanism for a moment. Yes, posthumanism is actually a thing, and it can sound like a rather odd movement to cheerlead. After all, we as humans aren’t done being human yet. Posthumanism’s adherents are anxiously awaiting the next stage of human evolution, homo technologicus. Yes, it’s also a real thing. I’ve also heard terms like techno-progressivism thrown around. As serious as some of these people may be, their concepts are surrounded by techno-utopian bullshit.
As amazingly silly as this sounds, their views aren’t far off from those of many people these days. Everyone from pure techno-utopians to level-headed “normal” people is kinda thinking the same thing. Let’s slap a bunch of tech inside our bodies and see what happens.
My goal with this post isn’t to address all the narratives or poke even more holes in the logic. I’m writing a book covering this and other topics. For this post, I want to point out a few glaringly obvious issues that should get more attention. The point of this post is that there is no free lunch regarding human augmentation.
Human Augmentation Must Be Universally Good, Right?
I never cease to be shocked at the casual nonchalance of people discussing slapping a bunch of tech inside their bodies, melding our brains with machines. I realize there’s a cool sci-fi aspect to it, but in real life, we have things called consequences. It’s different if there is a cognitive or motor impairment that the technology corrects for, another thing entirely when no impairment exists.
As a security researcher, I can’t bring myself to imagine these systems not being vulnerable to attack and, almost as bad, being used to manipulate us. We like to think of ourselves as the pillars of agency, but in reality, we can be nudged to do all sorts of things, resembling more automatons than humans.
This means that any of these systems would need to have a safe technical baseline. For a basic framework of a safe baseline, see the SPAR categories I’ve outlined previously.
I could address many other technical issues, but for the sake of this conversation, let’s call it a perfect technical implementation. A cognitive symbiosis of mind and machine without any technical issues or glitches. It is a completion of the techno-utopian dream.
Let’s look at why, even in a perfect implementation, there is still no free lunch.
Socrates
To look forward, let’s look back. This is Socrates. Totally not a fake photo, by the way.
Socrates has become a popular punching bag for the AI crowd. Apparently, dunking on a 5th-century BCE philosopher has become some sort of modern-day sick AI burn. So, what sin did Socrates commit that is so egregious to AI leaders today? He was against writing things down.
Socrates worried that writing things down would affect his memory, so he became a punching bag. However, what many don’t realize is that he wasn’t wrong. Writing things down can negatively affect your memory.
We can’t seem to imagine the past without viewing it through the lens of the present. People’s memories were far better in the past than they are today, even pre-social media and the attention apocalypse. It doesn’t take much thought to recognize this. In ancient times, when most people couldn’t read or write, the only place to store knowledge was in their heads. Even asking someone else, you were querying tribal knowledge stored in someone’s head. To his credit, Socrates stumbled onto cognitive offloading and recognized one of the effects.
Ultimately, we are better off for writing, and the benefit of writing things down far outweighs the benefits of a localized, tribal memory, even if individual personal memory is decreased. There are also other interesting effects of writing that Socrates missed, such as exploring thoughts and ideas and some of the memory-reinforcing effects. So, let’s forgive a 5th-century BCE philosopher their faults and focus on what he recognized for a moment: cognitive offloading.
Cognitive Offloading
Cognitive offloading is using physical action to alter the information processing requirements of a task to reduce cognitive demand. We all do this every day. If you’ve ever left yourself a note or set up a meeting in your calendar application, you’ve performed cognitive offloading.
This activity is beneficial since we only have so much cognitive capacity. It’s not just memory but decision-making skills as well. There’s a famous story about President Obama and why he only wore gray or blue suits. He was paring down his decisions.
I know it seems I’m making the posthumanist argument for them, but bear with me. Not all cognitive offloading is the same. In 2016, I heard the evolutionary biologist David Krakauer discussing cognitive artifacts on the Making Sense podcast. This was in the context of discussing complexity and stupidity. He referred to complimentary and competitive cognitive artifacts.
Without being too wordy, complementary cognitive artifacts help you create a model of the problem and are tools that rewire our brains to make problem-solving more efficient. These are things like maps, language, and even the abacus.
Competitive cognitive artifacts don’t augment our ability to reason but instead replace our ability to reason by competing with our own cognitive processes. Classic examples are the calculator or GPS navigation.
The interesting thing here is that complementary cognitive artifacts have imprinting and additional positive effects. For example, being proficient with maps increases spatial awareness. On the other hand, with competitive cognitive artifacts, you are probably worse off when the artifact is removed. For example, using GPS navigation systems degrades spatial awareness, so when it is removed, you are less capable than before.
I’m not arguing that we should destroy all calculators (or GPS navigation systems); I’m only pointing out the impacts of reduced cognitive function. It’s also interesting to consider that AI tools are almost universally competitive cognitive artifacts. We assume, wrongly, that there isn’t a cost to this augmentation. I mean, everything has tradeoffs in life. Technology is no different.
To avoid making this blog post a book a whole book, let’s look at memory.
Memory Storage
Most humans realize that memory is a limitation. Unless we are savants, there are only so many things we store in our heads. But we may be taking the offloading of memory too far. Let’s think about what we are actually doing. As humans, we are transitioning from knowing things to knowing where things are stored. We’ve treated this as universally beneficial without considering side effects.
We are transitioning from knowing things to knowing where things are stored
AI didn’t initiate this trend, but it has accelerated it, especially with systems like ChatGPT, which people use as oracles. This means the information we are retrieving may never have existed in biological memory in the first place and, more interestingly, may not be stored even after we retrieve it. Anyone who’s ever followed a YouTube tutorial on how to do something and, despite performing the task, had to review it again the next time can attest to this.
This brings up some interesting thought experiments. Is someone who doesn’t have any deep knowledge contained in their biological memory smart? After all, information on astrophysics is a search away. Would we say someone proficient at searching Google or prompting a language model is smart? Okay, let’s phrase the question a different way.
Is an average human + Google (or insert favorite AI tool here) smarter than Einstein or Von Neumann? After all, they have access to far more information far more quickly than either of those scientists ever did. Of course, the answer is no. We instinctively know there’s something more to knowledge and intelligence than merely knowing where data is stored or getting a summary from a document.
There’s no doubt that people may feel like Einstein, but that’s a topic for another day.
Human memory is getting worse, no doubt, due to technology. At the veterinary office I visit, I’ve seen people walk out of the exam room to use the restroom, go to the front desk, or go out to their car, and not remember which exam room they came out of. A clear degradation of spatial memory. These weren’t kids on TikTok or people staring down at their phones. People of all ages are represented.
But, not all memory tasks are straight lookup tasks, and memories spontaneously emerge. Sometimes, I bust out laughing when a memory pops into my head. This spontaneous surfacing has benefits, such as the creation of epiphanies and novel concepts creating a satisfaction that can’t be replicated with technology. What happens when this spontaneity disappears? Not only are we worse off, but it leads to more questions.
How do we develop novel ideas and concepts if we don’t have the right knowledge in our biological memory? It’s one thing to have knowledge and some novel concepts in memory and then explore external storage locations for further data. It’s another thing entirely to have no deep knowledge contained in biological memory and expect novelty to emerge because of access to external storage. I know the techno-utopians would say that we’ll build algorithms for this, but it’s a challenging problem and not the same thing and wouldn’t lead to the same results.
Humans + AI = Superhumans?
Human augmentation with AI is being sold as an intellectual get-rich-quick scheme, but the reality is gaining knowledge is hard. Sometimes, it is very hard, and there aren’t any shortcuts today, no matter how many prompts we create or documents we summarize. However, cognitive illusions are easy to come by. We end up fooling ourselves into thinking we know more than we do. Once again, AI didn’t start this trend. It’s merely the accelerant.
There’s a fundamental illusion clouding many people’s perceptions. Just as we can’t seem to view the past without the lens of the present, we can’t envision the future without using the same lens. We tend to assume we’ll keep our same faculties and gain more capabilities, resulting in some sort of win-win situation.
We mistakenly think human augmentation makes us superhuman, but in reality, it probably doesn’t. Despite knowing where information is stored and being able to perform some additional computational tasks, which may give us some superhuman capabilities in a few narrow areas, the reality is it may not make us superhuman overall and probably makes us worse. These additional capabilities will create very real and expanded blind spots and deficiencies. Of course, these won’t be identified until far too late, and everyone will claim not to have seen them coming.
These additional capabilities will create very real and expanded blind spots and deficiencies.
We haven’t even asked ourselves what we hope to get from this symbiosis or augmentation. There is just this generic sense of “enhancement,” but nothing overly specific. It’s one thing if the augmentation addresses some deficiency, such as reduced cognitive or motor function, but what are we addressing when a perfectly functioning human decides to augment themselves?
The reality is that when this symbiosis happens, we will become completely dependent on technology for far more than complex tasks; we will also be dependent upon it to function in our daily lives, even for simple tasks. This is because we will use the resources to offload even more cognitively, regardless of task complexity. Who wins in this scenario? Tech companies? Society? Us? At this point, will the technology still be working for us, or will we be working for the technology? More importantly, at what point do we stop being recognizable as humans?
Parting Thought
I’m not opposed to human augmentation or even being augmented in some way myself. But as an adult who has lived on planet Earth for a bit, I want to understand the tradeoffs. Understanding the costs is essential to determining whether the augmentation is worth it. It seems that in some cases, we may be stiffed with a hefty bill that we never would have agreed to ahead of time.
When it comes to being human, there are certain things we’d like to protect and certain things we are fine giving up. This will be different for each individual, but we all have this. These considerations will have to be part of our future decisions.
Our brains seek to free up resources and limit the amount of work they perform to create brain capacity for other tasks. In short, our brains seek to offload as much as possible. This is something we don’t consciously realize. It’s one of the reasons we prefer getting an answer to solving a problem. Our brains seek the offloading path, whether it’s helpful or not. This evolutionary quirk may have served us well in the past, but with technological advances, it may not serve us well in the future.
The movie Idiocracy is a cult classic that has been quoted more and more over the past few years. Here’s something to think about. It could be that Mike Judge got the future outcome of the movie’s setting right but just got the premise wrong. The only way the world of idiocracy could have come about is if highly capable AI had been in the background, making everything work and, of course, manufacturing Brawndo. Brawndo has electrolytes!
There has been some buzz over a new paper called Durably Reducing Conspiracy Beliefs Through Dialogs With AI. You can read the paper here. In their paper, they found a roughly 20% reduction in conspiracy beliefs, and this reduction was still in effect two months later. So far, so good.
To be fair to the paper’s authors, it has not been peer-reviewed. I’m also not directly refuting their results, but I am questioning their data, which affects the results. In this post, I’d like to highlight a few items that should raise eyebrows as well as give people some ideas for a more critical eye when it comes to data. Far too often, people read the abstract of papers like this and share the post, but we must give a more critical eye to research rather than just taking results for granted.
The Obvious
There are a couple of obvious observations here that I’ll quickly note and move on since it’s not the subject of this post.
If the outcome is true, then the inverse is probably true. So, if AI can break people’s conspiracy beliefs, it also has the power to implant them.
This wasn’t the default version of GPT-4. They used GPT-4 Turbo and specifically tailored it to have these dialogs.
This study was done outside the normal area where people with conspiracy beliefs would encounter such tools.
In an age of personal AI, people with conspiracy beliefs may have their own AI that they’d use to refute these tools. So, the AI vs AI fight club is probably closer than we think.
Now, let’s look at the bigger problem.
The Bigger Problem, Data
As soon as I saw this study, a huge question surfaced. Where did they find these conspiracy theorists?
Conspiracy theorists are typically suspect of government and academia. On a recent road trip, a truck passed me with a bumper sticker stating, “Not a Ph.D.” This was followed by several other bumper stickers claiming the election was stolen, Q something or other, and a few more that could be easily guessed. However, this person didn’t appear to claim the earth was flat, so I guess that’s a silver lining, but there’s still a catch.
Conspiracy beliefs are like Pringles. It’s hard to have just one. These beliefs make them more paranoid and suspect, especially in studies and academia. I find it hard to believe that these true believers would willingly participate in such a study and that a few rounds with a chatbot would reduce their beliefs.
In the real world, family members and loved ones have implemented similar strategies to no avail. These people have emotional ties and strong bonds, and even that wasn’t enough to change their minds. The only thing that helped was a completely different tactic, not refuting the beliefs but telling people that they were worried about them and that they cared for and loved them.
To change beliefs, one’s mind needs to be open to new possibilities. Conspiracy beliefs are powerful because they allow one to play both the victim and the hero at the same time. I wrote about this aspect back in 2020 with an article called The Cult of Conspiracy Theory. All of this is evidence of pretty strong cognitive distortions that conspiracy theorists hold. So, back to the question. Where did they find these conspiracists?
They used a product called CloudResearch Connect. This is a site that pays people to take surveys.
I read the website but didn’t do a deep dive into this company or examine its methods more closely since this wasn’t important for the points I’m making in this post.
Now that we know the source, a new question arises. Are over a thousand conspiracy theorists waiting to complete surveys on Connect? It’s possible but doubtful. Here is another question to ponder. Is it more likely that people with durable conspiracy beliefs are a part of CloudResearch Connect, or is it more likely that people looking for extra cash would be willing to answer questions with the persona of a conspiracy theorist?
When reduced to this question, the answer is fairly obvious. I also get it. It’s hard to find people to participate in studies when you are researching topics like this. Far too often, this research is done with participants made up of the student body—hardly a representative sample of the real world.
When Data is Dark
The participant data in this paper covers multiple categories of David Hand’s Dark Data. In his book, Dark Data: Why What You Don’t Know Matters, he covers 15 categories of what he calls dark data. It’s an illuminating read, and once you see data in these categories, you won’t be able to unsee them. Hence, despite reading the book four years ago, the categories stuck with me.
Immediately, I spotted at least three main categories of dark data.
Choosing just some cases
Self-Selection
Feedback and Gaming
Choosing Just Some Cases The researchers choose to count on Connect for their entire sample. The sample provided by Connect may not be representative of the real world. This may cause them to miss the effects of real people with conspiracy beliefs and instead have people posing as conspiracy theorists.
Self-Selection All the people filling out the surveys volunteered to be part of the study. There may be a stark difference in how people who decide to be part of the study answer the questions from those who do not. In this case, actual conspiracy theorists.
Feedback and Gaming This happens when the feedback itself influences the values of the data. In this case, getting paid to be part of the study. Participants may be more likely to answer favorably to certain questions, playing the role of a conspiracy theorist to get paid.
In fairness, the paper’s authors were aware of some of these issues and tried to account for automated responses and confidence in their conspiracy beliefs. This can be seen in the Materials and Methods section of the paper. I think, in the end, there just weren’t any true conspiracy theorists on this site. Well, that’s my theory anyway.
Most Surveys Results Are Garbage
Let me give an example of how to think about these scenarios more simply. If I stood in the frozen dessert section of the grocery store and asked every person buying ice cream if they liked ice cream, I’d be in the high 90% range for respondents. This would be different if I asked people at the entryway to the grocery store or on the busy streets of Manhattan. I’d also have to ask if my sample size was representative of the population I was measuring. Did I ask everyone walking into the grocery store or only ten people?
I saw a survey a few years ago that made broad statements about the cybersecurity industry worldwide by surveying less than 100 people. Is it possible for less than 100 to represent an industry with many different focus areas across many different industry verticals, in many different countries, and at various levels of seniority? Of course not, but this type of thing happens all of the time. Survey data quality has reduced dramatically in value over the years. This, combined with poor data collection, equals garbage.
For a quick gut check on surveys.
Does the sample size seem representative of the population?
Does it seem likely people would answer questions differently based on the collection?
Did respondents self-select?
Here’s an example to clarify the self-select question since it may be blurry. Say the International Pizza Lovers Association sent out a survey. As a pizza lover, I may be more likely to respond than someone who is lukewarm or doesn’t like pizza. So, I self-select based on my enthusiasm for the topic, thereby biasing the data collection. This can also be negative. Anyone who’s spoken at conferences and gotten feedback knows that the people most likely to fill out feedback forms are the ones who didn’t like that talk.
Conclusion
I think the manipulation of humans by bots is a fascinating topic that deserves more research. However, for any research results to hold, reliable sampling of the population being studied must be conducted. Otherwise, we may be fooling ourselves. There’s an ongoing replication crisis in academia across various disciplines. In this post, we examined one of the contributing factors to this crisis.
Everyone from tech companies to AI influencers is foaming at the mouth, attempting to get you to mainline AI into every aspect of your personal life. You are told you should outsource important decisions and allow these systems to rummage through all of your highly personal data so you can improve your life. Whatever that means. With the continued push of today’s AI technology even deeper into the systems we use daily, there will inevitably be a data-hungry push to personalize this experience. In other words, to use your highly personal, sensitive data to whatever ends a 3rd party company would like.
Although we may have a gut reaction that all of this doesn’t feel right and may be dangerous, we don’t have a good way of framing a conversation about the safety of these tools. The ultimate question many may have is, are these tools safe to use?
The answer to this question comes from analyzing both the technical and the human aspects. In this post, I’ll address the technical aspects of this question by introducing SPAR, a way of evaluating the technical safety attributes, and discuss what it takes to achieve a safe baseline.
Personal AI Assistants
Personal AI assistants are the next generation of AI-powered digital assistants, highly customized to individual users. Think of a more connected, omnipresent, and capable version of Siri or Alexa. These tools will be powered by multimodal large language models (LLMs).
People will most likely use the term Personal AI (yuck) for this in the future. I think this is for two reasons. First, AI influencers will think it sounds cooler. Second, people don’t like to think they need assistance.
Personalization
Personalization makes technology more sticky and relevant to users, but the downside is that it also makes individual users more vulnerable. For personal AI assistants this means granting greater access to data and activities about our daily lives. This includes various areas such as health, preferences, and social activities. Troves of data specific to you will be mined, monetized, and potentially weaponized (overtly or inadvertently) against you. Since this system knows so much about you, it can nudge you in various directions. Is the decision you are about to make truly your decision? This will be an interesting question to ponder in the coming years.
Is the decision you are about to make truly your decision?
Safe To Use?
Answering whether a personal AI assistant is safe to use involves looking at two sets of risks: technical and human. You can’t evaluate the human risks until you’ve addressed the technical ones. This should be obvious because technical failings can cause human failings.
On the other hand, this isn’t about striving for perfection either. Just like drugs have acceptable side effects, these systems have side effects as well. Ultimately, evaluating the side effects vs the benefits will be an ongoing topic. If a technical problem with a drug formula causes an excess mortality rate, you can’t begin to address its effectiveness in treating headaches.
SPAR – Technical Safety Attributes
Let’s take a look at whether, from a technical perspective, an assistant is safe to use. Before introducing the categories, it needs to be said that the system as a whole needs to exhibit these attributes. Assistants won’t be a single thing but an interwoven connection of data sources, agents, and API calls, working together to give the appearance of being a single thing.
For simplicity’s sake, we can define the technical safety attributes in an acronym, SPAR. This acronym stands for Secure, Private, Aligned, and Reliable. I like the term SPAR because humans will spar not only with the assistant but also with the company creating it.
There is no such thing as complete attainment in any of these attributes. For example, there is no such thing as a completely secure system, especially as complexity grows. Still, we do have a sense of when something is secure enough for the use case, and the product maker has processes in place to address security in an ongoing manner. Each of these categories needs to be treated the same way.
Secure
Although this category should be relatively self-explanatory, in simple terms, the system is resistant to purposeful attack and manipulation. These assistants will have far more access to sensitive information about us and connections to accounts we own. The assistant may act on our behalf since we delegate this control to the assistant. Having this level of access means there needs to be a purposeful effort built into the assistant to protect the users from attacks.
Typically, when users have an account compromised, it is seen as more of an annoyance to the user. They may have to change their password or take other steps, but ultimately, the impact is low for many. With the elevated capability of these assistants, there is an immediate and high impact on the user.
Private
Simply put, a system that doesn’t respect the privacy of its users cannot be trusted. It is almost certain that your hyper-personalized AI assistant won’t be a hyper-personalized private AI assistant. Perverse incentives are at the core of much of the tech people use daily, and data is gold. In fact, it seems the only people who don’t value our data are us.
Your hyper-personalized AI assistant won’t be a hyper-personalized private AI assistant.
Imagine if you had a parrot on your shoulder that knew everything about you, and whenever anyone asked, they just blurted out what they had learned. Now, imagine if that parrot had the same access as you have to all your accounts, data, and activities. This isn’t far off from where we are headed.
Your right not to incriminate yourself won’t extend to your assistant, so it could be that law enforcement interrogates your assistant instead of you. Since your assistant knows so much about you and your activities, it happily coughs up not only what it knows but also what it thinks it knows. Logs, interactions, and conversations could be collected and used against you. Even things that may not be true but are inferred by the system can also be used against you.
Aligned
AI alignment is a massive topic, but we don’t need a deep dive here. What we mean by alignment in hyper-personalized assistants is that they take actions that align with your goals and interests. The your here refers to you, the user, not the company developing the assistant. So many of the applications and tools we use daily aren’t serving our best interests but the interests of the company making them. However, this will have to be the case in the context of personal AI assistants. Too much is at stake.
These tools will take action and make recommendations on your behalf. In a way, they are acting as you. You need to know that actions taken or even nudges imposed upon you are in your best interest and align with your wishes, not any outside entity’s wishes. Given the complete lack of visibility in these systems, this will be hard to determine, even in the best of cases.
Reliable
A system that isn’t reliable isn’t safe to use. It’s almost as simple as that. If the brakes in your car only worked 90% of the time, we would assume they were faulty, even though 90% seems to be a relatively high percentage.
The problem here is that other factors can often mask issues with reliability. For example, if we get bad data and never verify the accuracy, we won’t know that the system is unreliable. Quite often, in our fast-moving, attention-poor environments, we don’t know when our information is unreliable.
Additional Notes on SPAR Attributes
SPAR attributes aren’t simply features that can be attained and assumed to maintain their status in perpetuity. These features must be consistently re-evaluated as the system matures, updates, and adds new functionality. You can see this in Social Media. Back in 2007 and 2008, when I was researching social media platforms, these were mostly issues with the technology. However, if you look at the dangers of social media today, the technology is fairly robust, and we encounter human dangers.
Of course, startups can also be acquired, opening new dangers to people’s information and actions taken. The startup with a strong data privacy or alignment stance can become a big tech company that doesn’t respect your privacy and emphasizes its own goals.
It’s important to realize that none of these categories have been attained to an acceptable level today despite the constant hype surrounding the technology. There is no doubt that today’s technology, with all of its flaws, will be repackaged and marketed as Tomorrow’s Tools.
SPAR Attainment
Once a system has SPAR attainment, which means it properly addresses SPAR attributes, then we can consider the technology to have an acceptably safe baseline. That certainly doesn’t answer our question about whether the technology is safe to use, but what it does do is give us a safe baseline to further evaluate the potential human dangers and impacts.
Conclusion
I hope this post provides a useful starting point for discussing personal AI safety, which is about to become a massively important topic. As AI gets more personal, we must evaluate potential tradeoffs and set boundaries. We can’t do this until the technical safety attributes are accounted for.
To add to the complication, the speed at which these tools are created and the lack of configuration options makes that nearly impossible. Unfortunately, it will remain in this state for quite some time. Still, if organizations address SPAR attributes, it makes it much easier to consider having a safe baseline from which to provide further explorations of safety.
Historically, attackers have targeted large, centralized systems that only represent a small amount of an individual user’s data. This is high value for attackers, but it has a low impact on individual users. This will morph in the coming years. Hyppönen’s Law needs an update in the AI era because in a world of highly personalized AI, if it’s smart, you’re vulnerable.
Hyppönen’s Law needs an update in the AI era because in a world of highly personalized AI, if it’s smart, you’re vulnerable.
Something that may have gone unnoticed in recent months is that there has been a brewing backlash against physical AI-powered devices. The most recent example of this was the crowd attacking the Waymo vehicle in San Francisco, but this is far from the only example.
I started noticing this trend with AI-powered food delivery vehicles. This was a bit odd since they seemed to be pretty good at failing themselves, often found tipped over, going the wrong direction, and even, in one case, committing a hit-and-run.
Although many would write this cheering crowd off, envisioning Enoch in the hands of the destroyers, this would be a mistake. Those of us working in tech can roll our eyes and dismiss outlandish reporting and other nonsense but think about the countless people who have a steady stream of this clogging their newsfeeds. There is an underlying mood to these activities that cuts much deeper than tech hate or the realities of technology innovations from the past. In this post, we explore what’s bubbling beneath the surface.
Quick Update
I haven’t had much time to write content lately. This isn’t for lack of material since there’s been a firehose of things to write about. I’ve been working on something much, much longer than blog posts, so that’s consumed quite a bit of my time. Also, you can check out my post on something I call the AI Solutions Risk Gap on the Modern CISO blog. Here, I break down what really matters with AI Risk and give leaders some topics for consideration.
Why The Backlash?
I believe what we see here is the beginning of something bubbling to the surface. This is the inevitable outcome when hype meets uncertainty. AI hype is putting all of humanity on notice, and humanity notices.
AI hype is putting all of humanity on notice, and humanity notices.
So, you may wonder why attack self-driving cars or delivery robots. It’s because these are the physical manifestations of AI in the real world. After all, it’s kind of hard to punch ChatGPT in the face. These devices represent symbols of a future that doesn’t need humans at all, erasing humanity from the equation. It’s a mistake to attribute this to Luddism or tech hate.
First of all, anyone invoking the Luddites should read Brian Merchant’s book Blood in the Machine. Second, the Luddites’ concerns about tech affected their industry and the social and political environment surrounding it. It’s an entirely different scenario when discussing a large swath of humanity in multiple industry verticals and positions. Unfortunately, this uncertainty is primarily driven by exaggerated news reports and speculation.
Examples like this gem from Yahoo Finance that shows few cuts attributed to AI but speculates that people are lying despite direct quotes to the contrary. There are also framing issues and more to critique in the article, but most people won’t notice any of this. All they’ll see is, once again, AI is coming for your job.
People have a sense that the technology isn’t as good as it claims to be and yet continually see reporting to the contrary. They also see the launch of what I refer to as shitty AI gadgets, like the Humane Pin and the Rabbit, two devices that apparently investors and the media love, but the rest of the world, not so much.
AI is now being shoved down our throats in absolutely everything, whether we want it or not. Even Mozilla is scaling back to focus on adding AI to beloved Firefox, despite the fact that absolutely no Firefox user actually wants it. Microsoft is cramming it into every corner of the Windows operating system. Deep AI integration is bad for both security and privacy and despite this being known, the push continues. Nothing is sacred anymore.
To pile on, people are being told their jobs are in danger, which they are, but not from super capable AI, but from overzealous business leaders who hope that the tech catches up faster than they’ll have to backfill positions or rehire the people they let go. This is despite underwhelming performance when they do demo products or launch experiments.
There is no doubt that, in general, AI technologies will continue to make progress, solve problems, and become more capable. We will even get to AGI. But in the short term, we are being sold a bill of goods before these companies even get the technology working, much less working effectively. It’s like thinking you are buying a Ferrari, but when you take delivery, it’s a wooden go-cart with wet paint and the word Ferrari on the back.
It’s like thinking you are buying a Ferrari, but when you take delivery, it’s a wooden go-cart with wet paint and the word Ferrari on the back.
You even have people like Sam Altman telling people ChatGPT will evolve in uncomfortable ways, wanting to push this technology further into your personal life with far more access to your data. No wonder people protested outside the OpenAI’s office. Give us more of your data so we can replace—I mean, help you. The reality may not be so cut and dry, but that’s what’s in people’s heads, and they don’t like it.
Tech companies hope to employ their standard brute force playbook and steamroll through the problems, but I think it’s far more challenging this time. The AI field, in general, will bring us a lot of advancements. LLMs are undoubtedly useful for some tasks but remain overhyped and won’t get us to AGI. LLMs are the Diet Coke of AGI. Just one calorie is not nearly enough.
Human Manipulation May Win The Day
If all this wasn’t depressing enough, if there’s one thing we know for certain, it’s that humans are easily manipulated. We can reliably reproduce these results. Companies will start to employ more manipulation techniques to avoid larger issues and ease adoption.
These can be subtle and often go unnoticed. For example, have you noticed how the responses are displayed while using ChatGPT? They are completed across the screen as though someone else is typing back to you. This makes it feel more human.
I remember reading an article years ago about home assistant robots that were in development and how people didn’t like them. Then, the developers projected simple facial expressions on the robot’s face, and people warmed up to them. They were the same product that now had a simple face, with no further capabilities added.
To take this further, look at the image I used as the featured image for this post. It may make you feel sorry for the robot despite being imaginary and completely manufactured. The robot never existed, the human never existed, and the scenario didn’t exist. Yet, we still can’t help feeling sorry for the poor robot despite the fact it may have been a homicidal, mass-murdering robot whose sole purpose was to kill as many people as possible.
So, if we apply subtle manipulation to the current situation, imagine the delivery robot having a statement printed on it that says, “If you see me in trouble, please help me.” This is a statement from a piece of technology asking you, the human, for help. Since most people help when asked, people may be likely to stand up a tipped-over device and less likely to kick or destroy a device requesting help.
Or a wilder scenario, projecting a frowny face on the windows when a car is attacked and a voice that says, “Stop, you’re hurting me.” These techniques may reduce the number of incidents by manipulating the humans coming in contact with the technology through techno-social engineering.
Our world is already filled with priming, subtle manipulations, and nudges
Our world is already filled with priming, subtle manipulations, and nudges. Companies building this technology won’t find ways to make the situation more equitable for humans; honestly, that’s not their job. However, they will find ways to manipulate us into believing it’s in our best interest, ease adoption, and minimize backlash. Anthropomorphism and other human manipulation techniques will be employed to serve the company’s goals. On the other hand, this is something we should all be concerned about.
One example of manipulation is this article. Notice the mental tricks Sam Altman employs. By claiming AI is dangerous in this way, he’s creating a humble brag about its incredible capability. Claiming to want regulation makes him appear reasonable and concerned. He gets to play the hero and the victim at the same time. It’s a lot less genuine when you realize this is a push toward regulatory capture. I’m sure there’s an Onion article in here, somewhere like, Man Creating AI Says It’s Dangerous And Wishes There Was A Way to Stop Himself.
Business Leaders
Business leaders play a critical role here and need to be more critical of claimed advances in the AI space. When putting pressure on internal developers, they need to understand that the biggest companies in the world are struggling with operationalizing generative AI. So it’s reasonable to assume you’ll have challenges as well.
Business leaders also need to be far more critical of vendor AI claims. Keep in mind that demos are staged and offer known variables for vendors to present during sales meetings. These situations don’t match your organization and the unique data and challenges you’ll encounter. When evaluating a demo, ensure that it’s evaluated on your data with problems that you encounter. Also, ask the vendor about challenges you’ll have, as well as things that their tooling doesn’t do well. If you don’t get good answers, run as fast as you can in the opposite direction.
Common things I hear are, “Why would make up stuff about their products?” This is typically when I spit my drink out. Dig in and verify claims. Just because a product may work in one environment doesn’t mean it will work in yours.
Conclusion
Although we all love to rage against the machine, the problem is that we are all a part of it. In the near future, we’ll start to see more applications of techno-social engineering. We also need to be far more critical of the news stories we consume. There’s a deluge of junk research and sensational news stories out there. Staying level-headed and asking the right questions can help keep you grounded on where the realities are.
Update
I wrote this article before seeing Brian Merchant’s article. You can read that here: https://www.bloodinthemachine.com/p/torching-the-google-car-why-the-growing He digs a bit deeper into the self-driving vehicle aspect, so we have a similar theme but a different focus. It’s well worth the read. Also, I learned from that article that people were destroying e-scooters as well.
I wanted to take a moment to address an obvious issue. In my last post, I discussed making sense of AI predictions and gave a framework to help, but how do you make sense of the absurd? After writing that post, I’ve seen imaginings of not a 10x productivity boost from AI but 100x and even 1000x. Not with some futuristic, cutting-edge technology yet to be developed, but with what we have today. Now is a good time to remind these people that the “x” after the number means times, although it seems to have morphed into a generic representation of “better.”
There is truth to their statements when they claim humanity isn’t prepared for 100x or 1000x productivity increases. This is true. We also aren’t prepared for a stampede of unicorns frying us with their laser beam eyes. A 100x or 1000x productivity increase is such a large number that it makes us mentally check out, insert the word “better,” and open the door for pointless pontification. Not to mention, this level of performance increase would be pointless in almost every context.
The Reality
Let’s think about this realistically, with some numbers. A widget factory produces a 1000 widgets per day. It then uses some magical AI dust to gain a 100x output boost, that’s 100,000 widgets a day. Now, at 1000x, that’s 1,000,000 widgets per day. In one year at full capacity, they’ll have 365,000,000 widgets vs. the previous output of 365,000. I hope they have a whole lot of storage.
Now, think of the developer who previously wrote 500 lines of code per day and now outputs half a million lines of code daily. Or, take the case of a prolific author who writes a book per year. Now, they are writing a thousand books per year. What should be clear by now is not only the absurdity of all this but also the larger problem: where are your 1000x new customers?
Output on this scale is meaningless without a massive increase in consumption. A 1000x productivity boost is pointless and possibly burdensome without a 1000x increase in consumption, aka customers and demand. Even with a scale-back in production, it’s still burdensome at this rate. I mean, unless you only plan on firing up the factory for a single day a year.
These things remind me of the Philip K. Dick short story Autofac, where automated factories keep replicating themselves and producing goods that nobody wants. Maybe the milk really is pizzled.
Y, Tho?
So, why are people claiming that 100x or 1000x productivity boosts could be on the horizon? Well, I can only assume because those numbers are bigger than 10x 🤷♂️ Even though 10x is already a massive productivity increase, bigger claims, bigger hype. People aren’t going to read your blog posts anymore if you are still 10x’ing. In the hype game, it’s go big or go home.
There is something about spelling out claims like this that makes the obvious flaws shine through. Putting numbers to this demonstrates the absurdity of the claims and should highlight a significant flaw in the logic. Of course, maybe there’s another likely scenario. People using ChatGPT to write their blog posts aren’t confronted with reality since they aren’t actually reasoning through the content they are churning out. They are too busy 10x’ing to realize flaws in their statements. Oh well, that’s a blog post for a different day.
People using ChatGPT to write their blog posts aren’t confronted with reality since they aren’t actually reasoning through the content they are churning out.
Real World Performance Boosts are Happening
There are certainly situations in which organizations can and even are gaining productivity increases using todays AI technology. These may even be doubling, tripling, or, in some rare cases, approaching 10x productivity increases, but these are highly specific, situational, and typically related to tasks and not the entire chain. Many of these areas are creative in nature: copywriting, stock photos, VoiceOver, and even video game design. All of these areas have seen massive productivity boosts from generative AI. Of course, we won’t get into a conversation on quality, but good enough is fine for many of these tasks.
Conclusion
There is no doubt that AI has the potential to transform our world in a wide variety of ways. This is both with the technology we have today as well as the technology we’ve yet to invent. There will be plenty of surprises, advancements, and things we didn’t see coming. However, we have to stop giving oxygen to people making outlandish and absurd claims. Remember, they aren’t making these claims for your benefit.
Happy New Year! 2024 is in full swing, and already, people are coming out with their big, over-the-top AI predictions. This was to be expected, but I think it’s helpful to calibrate the conversation so that people have a point of reference in their attempts to make sense of these predictions. Even though an overwhelming majority of AI predictions are nothing more than nonsense, it’s helpful to have a simple framework to quickly evaluate the claims in a prediction and determine if even very smart people you respect are saying nothing at all.
AI
The term AI has become so generic that it encompasses technology and approaches that have been both invented and yet to be invented. Anything falling under the umbrella of “automation” is now called AI. Given this generic nature, how do you even know what people are referring to when making claims and predictions?
In my presentations at conferences and events in 2023, I told the audience the first step in making sense of AI advances is understanding what people are referring to in the first place. In this post, I hope to add some clarity around this topic.
Published Human AI Predictions
Let me tell you something you may already know: most people’s published AI predictions are nonsense, falling into the categories of pure guesses, wishful thinking, or pointless parroting. These predictions only serve to drum up hype and marketing buzz. Predictions are often highly biased based on who is making them. Even my own 2024 AI predictions are biased based on the fact that I’m a security researcher. Worse, quite a few have absolutely no visibility or exposure to the technology trends they are predicting.
A fun experiment is to look at the person generating these predictions and their position at the company and guess their prediction. This can be done with a high degree of accuracy. Still, there are others who, when you read their thoughts on AI and their predictions, can tell they’ve never used the technology they are discussing. Yes, please give me more of that person’s predictions.
There are perverse incentives all the way around, from people making the predictions to the media organizations pumping them out for clickbait. It’s no wonder people are confused and unable to understand where things are.
Speaking of perverse incentives, there are the AI Influencers. Don’t get me started. Anyone who missed the cryptocurrency craze and wants the full crypto bro experience only needs to subscribe to their content because everything is 10x-ing around there.
So, what about the leaders of the AI companies? They must be safe, right? Well… eh.
It would be best if you took Sam Altman’s or any other AI organization’s leaders’ AI predictions with a grain of salt. Not to be too cynical, but they are never giving an answer in service of your benefit. Even their critical assessments of AI, such as their concern about AI risks, boost the image of how powerful AI is and further fuel hype. It’s like a human extinction, humble brag.
It’s like a human extinction, humble brag.
So, before you take a prediction seriously, you have to consider a couple of things.
Would this person have some insight into the trend?
Do they have an incentive to say or not say certain things?
Now, let’s move on to our framework.
Framework for Making Sense of Predictions
Given that absolutely everyone is making AI predictions, how do we even begin to make sense of these predictions and whether they should be taken seriously? Although there is no hard and fast answer, a simple framework can help reduce the nonsense.
When evaluating AI predictions, look at a couple of factors. What specific technology is being referred to, what is the prediction timeframe, and did they provide a reason? You can filter out most of the absurdity using only technology precision and timeframe.
Technology Precision
Timeframe
ReasonProvided
Weak
Wide
No
Strong
Narrow
Yes
To make sense of this, let’s look at an example below. This was published a couple of days ago by a well-known security professional. Now, this is a person I respect, and I even enjoyed their last book, so I’m not singling them out. It just happened to be the latest example I’ve seen.
“AI changes everything,” PERSON tells MEDIA on a video call. “The AI revolution is going to be bigger than the internet revolution.”
It’s easy to scratch your head and ask, “WTF does this statement even mean?” Well, it means absolutely nothing, but in terms of our framework, this would be classified as Weak, Wide, and No. What AI approach is this person referring to? When will It be bigger than the internet? There are countless examples of similar overly generic declarations, and I’d wager that a vast majority of people making these claims couldn’t answer with any precision or timeframe.
Without qualifying the precision and a designated timeframe, the statements say absolutely nothing. It’s window dressing for drivel. You might as well say, “Cyborgs will change everything, and the cyborg revolution will have a massive impact on humanity.”
I don’t think that many doubt at this point that some AI technology will transform humanity at some point, even in the near future. Science fiction writers have been envisioning this for quite some time. This is why details matter.
Here’s another generic goodie. Yeah, but what and when bro? Remember, people can purposefully wield being vague as a superpower. If you are vague, you can fuel the hype, wiggle into alternate descriptions, and never worry about being called out on anything specific when it doesn’t happen. This is the Nostradamus playbook: be vague so that everything fits. Musk’s statement was made a few days before Grok was launched, so there’s that.
As a general rule, you can toss out anything falling into the Weak Precision and Wide Timeframe category, as it can mean everything and nothing simultaneously. Most AI predictions I see fall into this category.
Let’s look at some more predictions. Here’s another example below.
These predictions weren’t accurate either, but at least they have strong precision and a narrow timeframe, even without reason. I know he’s generically referring to “AI” here, but through another context, he’s referring to Large Language Models. Sometimes, you have to infer a bit of surrounding context with these. None of these came to pass, but it illustrates another important lesson.
We reward people for being bold, not for being right.
In our modern, social media-driven, self-promoting, nonstop content-producing world, we reward people for being bold, not for being right. People making wild, baseless predictions continue gaining followers and directing people to funnels and newsletters that further allow them to monetize content, which seems (and is) entirely backward from what you’d expect. It’s part of what I call the influencer hype circle. It’s possible to outrun your predictions. People have such short attention spans you can make precise predictions, and a few hours later, people won’t remember, much less months or even a year later.
So, why do we get sucked into these things?
There are a couple of psychological tricks at play. We become desensitized to things we see more often, making more outrageous claims seem not so outrageous. We also tend to assume there must be a consensus the more we see something. The more hype-confirming things we see, the more people say hype-confirming things, which leads to us seeing more hype-confirming things, and over and over, building a false consensus. Hype doesn’t have to be pro-AI hype either; existential risk hype or hype around AI-generated misinformation can fall in the same category.
Note: In fairness, you have to cut people some slack. There can be a big difference between what you tell someone in an interview and what gets cut up and printed. Also, media stories try to be brief. They are likely to print predictions without providing the reason. So YMMV.
Reason
We end by evaluating the reason. By looking at the reason, you can determine whether to put stock in the predictions and the person making them.
Does the reason make sense?
Does it apply to what’s being predicted?
Keep in mind the reason can be overly vague as well. “Technology is moving so fast that it’s inevitable.” That’s not a good reason. “Everyone seems to be saying…” Also, not a good reason. We got an early glimpse of this with ChatGPT. So many people made so many claims about how it was going to be more impactful than the printing press. When you’d listen to their reason, it was because it gave them a recipe in the style of Shakespeare or something similar. It’s hard to put a lot of stock into people who are basically fooling themselves.
Reasons are going to vary along with the amount of included details, so use your best judgment in your evaluation.
Conclusion
Hopefully, this post provided some basic ideas for filtering some of the noise. There are truly insightful people who share their thoughts, drowned out by a cacophony of voices saying nothing much. We are also nowhere near peak AI predictions, so using a basic framework can help reduce the noise and get you to what is important faster. Welcome to 2024.
AI-focused use cases applied to news delivery have picked up recently. It’s no secret why news-based use cases would be in the crosshairs of AI since it seems like a natural fit. News is text; LLMs do text, so why not let LLMs do the news? Boom. Obviously, if you’ve read about any of the many failures in applying AI to the news media space, you’ll know it’s not that easy.
There isn’t a shortage of problems in the news media space, either. Trust in the media remains near a record low, and reporting on every event is transformed into an editorial. So, it’s not like there aren’t real problems to address. Unfortunately, many of these aren’t technological problems.
Channel 1
Recently, something that caught my eye was Channel1.ai. Channel 1 doesn’t seem to solve any problems at all. In fact, it’s poised to create a few, that is, if it ever gets off the ground. Channel 1 bills itself as a “personalized” global news network powered by generative AI. Let’s dig in.
Channel 1 doesn’t seem to solve any problems at all.
Not Solving Problems
Looking at Channel 1’s offering, it’s hard to see any problems their solution addresses. There are still human editors and producers involved, as well as human fact-checkers. What they seem to be addressing is the pesky news anchor. Who knew that was the real problem in the media? I’m sure those media trust numbers are about to skyrocket.
But, Why?
It’s easy to look at Channel 1’s offering, scratch your head, and ask, “Why?” Like so many AI use cases these days, It appears to be nothing more than an attempt at a novelty. In an age where people are throwing spaghetti at the wall and seeing what sticks, this is yet another plate of spaghetti. However, the novelty wears off almost as quickly as it’s presented in our modern world.
There’s a current rush to put AI in everything, whether you want it or not, whether it’s necessary or not, and whether it solves a problem or not. Startups are counting on the fact that innovations can be elusive, and it’s not always obvious ahead of time. For example, many questioned why they would ever need to do anything other than talk on a cell phone. They are hoping you didn’t know you needed it. However, these use cases fall short of other successful, elusive innovations.
Creating Problems
Solutions like Channel 1 can potentially create more problems with news media delivery. Strangely, we can look at the world today and deduce that we can solve problems by creating even more filter bubbles, but that’s part of Channel 1’s pitch. The personalization of content down to the fake news reporter delivering it to you means that people can continue to live in their own highly customized bubble.
A glance at Channel 1’s description might lead people to believe one of the benefits is the ability to translate content into different languages in real time, but this isn’t the benefit that it seems. How do you check for translation issues in real-time? Beyond any real-time translation issues, there’s another problem with locality.
People are interested in international news stories but care about local news, which makes sense. These are stories affecting your community. How is Channel 1 going to verify all of these local stories, especially ones outside the United States and in languages other than English? Are they going to employ people in various regions throughout the world who natively speak these languages? Let me answer that for you, no. The human in the loop will be nothing more than a meat sack automaton pushing the publish button.
The human in the loop will be nothing more than a meat sack automaton pushing the publish button.
When there’s no footage of something, Channel 1 will create an AI-generated image to depict what it “thinks” the event would look like. Yikes.
Channel 1 will use AI to generate images and videos of events where "cameras were not able to capture the action." It likens this to how a courtroom sketch "is not a literal depiction of actual events" but helps audiences understand them.
Comparing an AI-generated image to a courtroom sketch is delusional, especially since a courtroom sketch is done by an artist who witnessed the events and often sketched them as they happen. This isn’t an AI making up things that look similar to an event. Even though these images are labeled as AI-generated, this is a terrible idea because it’s creating an image of reality that never existed.
News agencies often use b-roll footage and footage from other events in their news stories today. For example, using footage from a protest a year ago for a story of a current protest. I think this is a terrible practice that should be discontinued, and it is just one cog of many in the current collapse of trust in news media. We are partly to blame for this because we want more exciting and entertaining reporting than merely regurgitating the facts.
Getting It Wrong
Whether human or AI-based, misinformation making it into a seemingly legitimate news source is a recipe for disaster. I’ve pulled no punches in my criticism of the dangers of AI-generated misinformation and deepfakes. However, one of the ways misinformation can gain legitimacy is when it’s disseminated through legitimate news sources. This is why legitimate news organizations should be highly critical of AI use cases in their environments and understand that failures can have problematic impacts and further loss of confidence by the public.
As newsrooms shrink and resources become more scarce, the ability of news organizations to hold each other accountable becomes nonexistent.
Here is another thing to think about. As newsrooms shrink and resources become more scarce, the ability of news organizations to hold each other accountable becomes nonexistent. Many news sources have just become aggregators for other people’s content. In some cases, a single news story by a single reporter may get amplified and spread through countless other news sites. Modern news organizations don’t have the resources to verify truths on the ground, so they are just left repeating content from other reporters, who may not be acting in good faith. It’s another way misinformation can propagate and amplify. In this case, too, Channel 1 is contributing to the problem.
The Real Fake News
I think Channel 1 will fail and possibly not even launch. It may not launch because of technical issues and constraints. For example, their demo was pre-generated and not done in real time. So, there are technical hurdles they have to address, but their issues run deeper. Ultimately, I think Channel 1 will fail because of its delivery. It’s the real fake news.
When you first check out Channel 1’s demo, you are immediately taken by how lifelike the anchor’s appearance is. However, as with all of these technologies, applying even the slightest scrutiny highlights obvious issues. You then notice how the stiff, lifeless delivery is met with the inability to keep the mouth in sync. It becomes a distraction from the very point of the product. The more you watch, the more it feels… creepy.
Even though we are surrounded by fakery on a daily basis, we still overwhelmingly don’t like fake things, especially those that are supposed to seem real.
They Aren’t Max Headroom
These AI-generated human personas strive for visual perfection but forget something far more important. Visual perfection isn’t what attracts people to personas. If that were the case, cartoons wouldn’t be popular. The reality is that these companies strive for visual perfection because personality is either incredibly elusive or not possible.
Max Headroom’s jerky, glitchy presentation wasn’t something to be minimized; it was part of his persona. Of course, one thing he wasn’t short on was personality. We have all of this cutting-edge technology, yet back in the 80s, a person imitating an AI, imitating a person was still far more engaging. And, his lips were synced.
AI and The News
Will AI use cases assist news media? Perhaps, but it’s important to realize that big challenges in the current news media aren’t technological and fall more into the human and societal bucket, and prescribing tech to solve these issues hasn’t gone well in the past. I guess we’ll find out, because more is on the way in 2024.
What’s the best way to tell people you work in tech, not in healthcare, without telling them? CarePod! While people argue about when AI will destroy humanity, things like this continue to steamroll forward. Everyone seems to be having a good ole time messing around, but trust me, the find out stage isn’t far behind. It’s Friday, and I’m feeling extra spicy, so let’s dive in. 🌶️ 🌶️ 🌶️
Tech Bro Tries to Be Healthcare Bro
Okay, this CarePod article has me spun up even more than the Martin Shkreli Dr. Gupta nonsense. I think because I don’t believe anyone would take Shkreli all that seriously, CarePod is something else and has a presentation that looks more serious, kind of like putting a Ferrari body kit on your Pontiac Fiero.
No, I’m not spun up about the application of AI to healthcare scenarios. AI has a lot of potential in healthcare use cases, and we could absolutely see positive results when done properly and applied to the right use cases. This would be done in cooperation with humans and technology. There’s a sweet spot here that’s complimentary, and there are things to be hopeful about.
I’m spun up about the tech bro optimization nonsense. This mindset is absolutely in conflict with progress. Make no mistake, when these things fail, it will have larger effects on AI in healthcare as a whole. See the quote below.
“Basically, what I’m doing is slowly migrating every single thing from a doctor and nurse to hardware and software,” he said. “We don’t even believe a doctor’s office should exist. We think that it’s a thing of the past.” - Adrian Aoun
🤦♂️ This is exactly why people think tech bros are out of touch with reality. Read that again. He thinks doctors and Nurses are irrelevant and they shouldn’t exit. He thinks we are living in the year 2175 or something. Maybe then it would be true, but not with today’s technology and not fast enough to catch up with the use case they are posing. Maybe I’m the only one, but I don’t want ChatGPT to be my doctor.
Jokes aside, what happens if this system detects something serious? Who are you going to send them to? Does the system say, “Sorry bro, ya got cancer?” How do you get a second opinion? How do you get a referral to a specialist when you aren’t a real healthcare provider? Most importantly, what about when the system is wrong? The list goes on and on.
There’s a problem with turning every human problem into an optimization problem. In doing so, you lose sight of the point.
There’s a problem with turning every human problem into an optimization problem. In doing so, you lose sight of the point. Healthcare is an incredibly human and personal activity that extends far beyond providing a clinical diagnosis. Distilling these activities down into just the diagnosis part is ignorant of the field as well as the goals.
Healthcare is also filled with edge cases, the same cases that AI’s aren’t good at. It’s easy to see how a combination of humans and technology could result in better outcomes because the strengths of one address the weaknesses of the other. Not having one replace the other.
“We’re using AI to read the research, pull out the care plans, and deliver it to consumers.” - Adrian Aoun
Oh, GTFO. Let me get this straight: This guy thinks doctors and nurses are irrelevant because you can parse papers and medical texts and do some generations like creating care plans. My face hurts from facepalming so much. This is not only delusional, it’s dangerous. It’s like thinking you’re a doctor because you have WebMD. Medical conditions are a thicket of symptoms that can be the same or damn near similar to each other. Hell, even lab tests can be gray areas and have margins of error. Navigating this is much harder than self-driving cars, and we haven’t even conquered them yet.
The Spies Like Us Moment
This reminds me of the movie Spies Like Us, where Dan Aykroyd and Chevy Chase have to fake being doctors. They try to do an appendectomy by reading a medical text. After misunderstanding the meaning of “shaving the patient” and a “hallucination” by Dan Aykroyd where, after almost cutting into the patient’s chest, he claims, “I merely probing to determine muscle tone and skeletal girth.” Punctuating it with, “We mock what we don’t understand.” Again, after almost cutting into the patient in the wrong spot and being guided to the right spot by the actual doctors, the patient dies on the operating table.
We are being presented with technology that is supposed to be Star Trek and getting a reality that is more like Spies Like Us. However, even in Star Trek, doctors were still stationed in the med bay.
We are being presented with technology that is supposed to be Star Trek and getting a reality that is more like Spies Like Us.
Healthcare Has Real Problems
There is no shortage of healthcare problems, and access is certainly one of them. Healthcare costs in the United States are astronomical. Many can’t afford their medications or regular doctor or specialist visits. There’s a long list. Of course, anyone who’s ever done Teledoc knows that, at times, it can be only slightly better than ChatGPT with the ability to write prescriptions. So, I get it. There are real problems here that we need to address, but most of these aren’t really tech problems. And CarePod isn’t addressing the most important issues.
Many things, such as checking your vitals, refilling perceptions, and certain lab work, are relatively low risk and don’t require much intervention. However, there’s a monumental leap from looking at this and saying, “Doctors and nurses are irrelevant,” just because you used Teledoc to refill a prescription or swabbed your nose for a test. That’s learning the wrong lesson, but the world appears filled with automation nails when you have an AI hammer.
The world appears filled with automation nails when you have an AI hammer
Scheduling an appointment with your family care provider can be an issue, depending on your geographic location and other factors, but it’s hardly the biggest issue. This seems to be what CarePod is largely addressing. It may lower costs a bit for tests and such, but these are hardly where the expenses come from healthcare in the US. You have prescription prices, specialist visits, as well as ongoing visits for more chronic conditions that add up quickly. Remember, CarePod is outside of insurance and doesn’t address the biggest costs and issues. You can have CarePod and still go broke if you have a chronic condition.
On another note, It’s interesting how they’ve turned a real doctor into a glorified button pusher regarding prescriptions. They say the prescription is available almost immediately, so they are also trying to “optimize” this step. How much time is the doctor given to review, and will this time be tracked and targeted to get it reduced? We know how this ends, with peeing in a water bottle instead of bathroom breaks. I wouldn’t put my medical license on the line for this.
In a world filled with automation optimism and automation bias, I think healthcare is still one of those areas where people like the idea of having a human in the loop. I know, so outdated!
Perverse Incentives
I can’t help but feel there are some perverse incentives at play. They claim they aren’t selling your data, well, in the short term, that is. Neither was 23andMe until they did. This is also a startup, so when and if it gets acquired, that organization will have access to this data. In the end, this might be part of the goal. Be valuable because of your data, not your service. All the military-grade encryption in the world doesn’t address LexisNexis buying your healthcare provider.
On top of this, why add the AI? The non-ai use cases can be helpful if your goal is to provide more access to care. Even my Publix shopping center has a blood pressure cuff and a scale. People use them all the time and find them helpful, and no AI is involved. Extending some services they provide without all of the AI nonsense would be extending care to people, but I guess you couldn’t wave the AI flag to attract funding.
Nobody wants to fund making things better. People want to fund revolutions. Bunting can get you on base, but everyone wants to swing for the fences when most people will strike out. Baseball analogies aside, it’s hard to see the end goal here. They don’t take insurance (because insurance wouldn’t cover it), it’s $99 a month for a subscription, and it’s hard to determine who exactly the customer is.
“Adrian Aoun is quick to explain that he’s not a medical doctor. He’s a computer scientist specializing in AI.” - Adrian Aoun
Yeah, we got that bro. You didn’t need to tell us. Oh well, I can’t wait till they release the CarePod colonoscopy! I’m sure it will be great. Rant complete. Enjoy your weekend.
Updates
November 13, 2024 – Removed the link to the YouTube Video since it no longer works.
"Meanwhile, the technical problems mounted. Automated blood draws routinely failed. Lab test offerings were withdrawn. And patients kept getting trapped inside the CarePods."
Seems like my Spies Like Us joke was on point, unless you think physically trapping your customers in your hellbox is a great business case.
OpenAI’s recent announcement was made during their DevDay, and it was hard to avoid. At this point, I don’t think OpenAI needs a marketing department. One of these announcements was of GPTs and the GPT Store. On queue, the amateur futurists swarmed social media with bold claims and predictions, stating that this was an App Store moment just like we had for the iPhone. So, is this an App Store moment? Are the stars aligning? Are we entering a new era? Let’s take a look.
Quick Note
So, before we dig into this, I like the concept of GPTs and even the GPT Store, which may not be apparent from the content in this post. That’s because this is a post about innovation and impact. The point isn’t whether paying customers of ChatGPT will use GPTs; it’s whether GPTs will create new paying customers of ChatGPT as well as create an inevitable market that companies will need to consider as part of their strategy. This is what it would take to make an “App Store Moment” and is the primary perspective of this post. However, I will highlight a few additional issues as we go along.
My Initial Take
This post expands on my initial comment (or hot take) here where I made some claims and predictions of my own. So, to summarize from my previous comment:
They are creating additional attack surface
They are inheriting the issues of an AppStore
Influencers, not innovators, will drive use cases
Most use cases will be inconsequential
Malicious use cases will propagate
Most interesting use cases will continue to be deployed outside the GPT Store
What Are GPTs?
GPTs are a custom version of ChatGPT that you can create for a specific purpose. Some examples they give are learning board game rules or teaching your kids math. You can create these with natural language without having to do any coding. The GPT Store will allow people to share and sell these GPTs to others.
In a nutshell, it’s a fancier way of selling prompts to others with additional features, such as adding data and connecting to the Internet.
GPT Store Use and Trajectory
Influencers will drive use cases, not innovators.
The GPT Store hasn’t launched yet, but it’s clear that influencers and AI hustle bros will drive the use cases, not innovators. Influencers will rush to fill the platform with chatbots where people can ask them questions based on previous content they’ve published. Being influencers, there’s absolutely no way they’d ever try to oversell the impact of these. (Feel the virtual eye roll.) There’ll also be a healthy dose of memes because you have to keep the world spicy 🌶️
There will also be a swarm of use cases where the only goal is to be first and a majority of use cases will be largely redundant or uninteresting (in the context of innovation), providing GPTs that basically do what anyone can do with ChatGPT themselves, only repackaged and marketed as something more capable. Newsreaders, page summarizers, document summarizers, and many similar GPTs will crop up. Mostly, these will be thought of as “throw-away” use cases.
Note: I’m not saying that these use cases are useless. Some may find them helpful, but once again, we are discussing these in the context of innovation and creating a culture of paying customers.
It’s likely we will see a host of celebrity and historical figure chatbots because they are easy to create. Maybe some celebrities will release branded chatbots themselves, primarily ones that don’t recognize the reputational risk. However, still, I wonder how many “Saylor Twift” type chatbots will crop up. These bots are allowed. You only need to mark them as “Simulated” or “Parody” according to OpenAI’s policies. That’s if their creators even bother.
Even with historical figures, there’s a huge problem with distilling them down into a subsection of their writing or public appearances and pretending that they’re somehow interacting with them or getting to the heart of what they actually thought about something, but this is a philosophical topic for another blog post.
We’ll see a familiar trajectory where you have a usage spike followed by a drop-off after people have checked it out.
99 Problems and an App Store is One
By providing the GPT Store, OpenAI inherits all of the issues associated with running an App Store. These issues should include providing proactive protection to protect users from malicious GPTs. In addition, another layer should be part of this in protecting the content of creators primarily from others using their work in an unauthorized way. This protection needs to be advanced and proactive to provide even a basic level of protection. Given the initial launch and announcement, there doesn’t appear to be anything like this.
OpenAI has its acceptable use policy and will most likely count on the community for reporting. In addition, they may do some basic scanning, using a prompt to an LLM in much the same way as they did for plugins, but this is not even scratching the surface and is only a minuscule touch better than doing nothing. This won’t be maintainable if the GPT Store grows at all, and with the ease of building and deploying GPTs, this will spin out of control quickly.
Content Theft
People will undoubtedly create GPTs with other people’s content and work. This will drive less traffic to the original creator’s funnels. This is stealing other people’s work in a more direct way was done for art.
Disturbingly, some see no problem with taking a book like Outlive and creating a chatbot out of it. Even more, find no issue with taking Dr. Attia’s public content and making a chatbot out of that. There seems to be this impression that it’s fair game since he put the content online. There is something rotten to the core with this mindset, especially in cases where you are monetizing someone else’s work.
To make matters worse, GPTs and the GPT Store make it much easier to build and deploy systems that use other’s content with less friction than a more standalone solution, which is why you’ll see more content theft with GPTs vs other methods.
GPTs and the GPT Store make it much easier to build and deploy systems that use other’s content
Don’t hold your breath for a solution here. OpenAI has a mindset that they are providing the tools, and if people misuse them, that’s on them, but there is a huge gaping hole in this logic regarding content. How would anyone go about this themselves? It’s difficult to identify in all but the most egregious cases, so yes, calling your GPT the Dr. Attia Bot or the Outlive Bot would certainly raise some eyebrows, but the real harm is behind the scenes. The Live Longer Bot, completely made up of Dr. Attia’s work, would be difficult or near impossible to detect from the average content owner’s perspective.
The responsibility for detecting this type of misuse can’t be thrust onto content owners. Creators can’t police the GPT store for all of the instances of usage of their content. Only OpenAI could do something like this and accomplish it in a way with breadth to have a chance of success. The fact that OpenAI isn’t even considering a real solution to this problem should tell you all you need to know.
There is a caveat here, and that is, this is a hard problem, so I don’t mean to make it sound easy. It’s not like all you have to do is make a list and check against it as people deploy GPTs. There needs to be a thoughtful approach that considers the capabilities and tradeoffs and gives people concerned about their content some methods to check and recourses to take. But doing nothing isn’t an option either.
After all, it’s OpenAI deciding to launch a platform that allows for easy theft, deployment, and monetization of other people’s content. It should also be their responsibility to ensure they are at least taking some real steps to protect content owners and give them a process for checking if this is the case in a meaningful and effective way.
Time will tell, but there doesn’t seem to be an indication that this will happen, and it may only happen after a series of lawsuits.
How creators may change their behavior based on content theft is an interesting thought experiment. How are you supposed to promote your work if, through promotion, your work is stolen and used? It’s a conundrum, and we shouldn’t learn the wrong lessons.
Malicious GPTs
There will undoubtedly be malicious use cases. These will try and steal information and data from the user. They may even try to trick the user into installing malware. To stop this, there would need to be more robust checks in place and a process to catch these malicious GPTs before they are deployed to the GPT Store.
The popularity of this as a vector for attackers will be the popularity of the GPT Store. So, malicious GPTs will scale with this popularity and draw more attention from attackers as the attention grows.
Surprises
I do agree with OpenAI’s comment that interesting (not necessarily the most interesting) use cases will come from the community. It’s possible that creating this GPT Store opens an avenue for someone to create a meaningful app that wouldn’t have been possible otherwise. There will undoubtedly be some of these use cases, and they will be pretty cool. We should expect some surprises like this. The ultimate question, though, is, will there be enough of these use cases where it’s interesting enough for people to continue paying not only for ChatGPT Plus but also any additional fees for the GPT? It’s possible, but I wouldn’t bet on it.
Most Interesting Use Cases Remain Outside The GPT Store
The most interesting use cases of the technology will remain outside of the GPT Store and its ecosystem. This is for some reasons that are fairly obvious upon reflection. This mostly comes down to access and control. Organizations want to exercise greater control over their intellectual property and data. Conversely, open-source models are highly effective, and an organization could easily construct a more self-contained solution where none of the data has to leave its control.
It’s not just control. It’s also about the technical feasibility involved with GPTs architecture. If you have a fancy prompt, need a bit of data from the Internet, or want to chat over a document, then GPTs are fine. If you are trying to integrate LLMs into an actual solution, then the capabilities aren’t there.
Companies would also need to actively look at the GPT Store as a valid delivery source for their customers. This would only happen if this were a large, untapped market. So, only if the GPT Store is a smashing success will this force companies to consider creating GPTs on the GPT Store.
And Security… Always The Afterthought
I spend countless hours discussing LLM security, so I won’t continue beating that horse here. Let’s just say all of the current security issues still apply to GPTs, with a bit more consideration for your use case, and security will undoubtedly be a driving factor for any business use case. Just like trying to protect your system prompt, anything you put in a GPT can also be exposed.
This vector means there are confidentiality and intellectual property risks with GPTs. And if you think, oh, that’s an easy fix. It’s not, and when this one is patched, another one will be found. Consider anything you put in a GPT as being public. If you have any IP or sensitive data, it must stay out of GPTs, and you’d be better served deploying independently.
If you have any IP or sensitive data, it must stay out of GPTs
The one thing you can count on is that things will be attacked and data will be lost. These are new technologies, and we are still poking around at them. I’ve said many times these systems represent a single interface with an unlimited number of undocumented protocols, which is bad for security.
These systems represent a single interface with an unlimited number of undocumented protocols, which is bad for security
Innovation Ripeness
Major disruptions caused by innovation, such as the App Store on the iPhone, aren’t just about the tech itself or its capabilities. It’s about how ripe the area was for innovation in the first place. This ripeness combines factors such as capabilities, social trends, and timing.
For those who don’t remember, phones were things people used to talk into… not to Siri but to another human being. You’d speak into the phone’s microphone, and magically, on the other end, someone would hear your voice and want to talk. For mobile phones, you’d have a certain number of minutes you could talk on your phone plan, and text messages were extra. That is, if you ever wanted to text at all on the phone’s number pad or if you were (un)lucky T9. People even had separate devices for listening to music. How ancient!
Then, the prices came down, and more and more people started carrying mobile phones while simultaneously getting data connectivity, keyboards, and storage. People started texting more than speaking, and the transformation of the phone into both a communication and entertainment platform began.
It was in the midst of this transformation of the phone into a more central part of our lives that the App Store arrived. People wanted more and more access while being mobile on a device that was more central to their daily lives. So, the capabilities of the platform, social factors, and timing all came together. The App Store drove companies to create apps based on this demand and tap new customers on the platform.
So, will the GPT Store be the new App Store? Given these factors, it’s highly unlikely. ChatGPT isn’t a central part of most people’s lives today, and there isn’t enough evidence to think that it will be in the future. OpenAI is trying everything it can to keep users paying for ChatGPT Plus with moves such as adding Dall-E 3 to ChatGPT Plus users. I’m not sure moves like this will be enough of an incentive to keep people paying, especially when there are other options and the space is so new.
Conclusion
GPTs and the GPT Store are a neat concept and a nice addition to ChatGPT. However, it is not well thought out regarding security and content protection. This will continue to be a constant tradeoff in the years ahead. This platform makes it much easier to steal other people’s work and monetize it as your own, and I hope that OpenAI takes some steps to help content owners detect and mitigate some of these risks.
Will it become as influential as the App Store? Highly unlikely. As always, play with this stuff yourself. See the features and capabilities for yourself.