There's consensus emerging among VCs that #LLMs will make software companies look more like services businesses because software will own a task end-to-end instead of merely enabling a human to work on a task faster. I've seen this from partners at NFX, Foundation Capital, and OpenView. The shorthand they like for this is "LLMs enable service as a software."

I'm betting the company (ATLAS) that they're wrong. I'm trying to work with GTM/PLG teams instead of replacing them, not because I'm particularly precious about displacing jobs but because LLMs just aren't good enough to do their jobs, and that won't change anytime soon.

LLM optimists point to LLMs' effectiveness in domains like coding, prospecting, and SEO content gen, but the wins here are limited and will be short-lived. Coding is unlike any other domain in that the content (code) generated by LLMs can often be unambiguously verified as correct. The code either runs or it doesn't. It's a mistake to generalize from the coding domain.

The effectiveness of LLM-powered SEO content gen and prospecting, moreover, is primarily a function of the competitiveness of the space in which these solutions operate. Once the competition shows up, no one is going to care about 80th percentile performance on cold outreach or SEO content. Humans will be back in the game. E.g., Dave Rigotti banned ChatGPT for content gen at Inflection.io this week. I already use "better than ChatGPT" as a filter for content I want to write.

LLM optimists often think that there's a clear path to making these models much better. There isn't. Pay less attention to people who have a financial interest in having you believe otherwise. Don't take my word for it either; I'm just incentivized in the opposite direction. Gary Marcus is too, but he's worth paying attention to.
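To make the verifiability point concrete, here's a minimal sketch (the slugify task and test case are hypothetical, not from the post) of what makes coding special: generated output can be checked mechanically, by running it.

```python
# Why coding is a special domain for LLMs: the output can be verified
# by execution. `candidate` stands in for code returned by a model API.

candidate = """
def slugify(title):
    return "-".join(title.lower().split())
"""

def passes_tests(source: str) -> bool:
    """Execute the generated source and grade it against a known case."""
    namespace = {}
    try:
        exec(source, namespace)  # run the generated code
        return namespace["slugify"]("Hello World") == "hello-world"
    except Exception:
        return False  # any crash counts as a failure

print(passes_tests(candidate))  # True -- the code either runs or it doesn't
```

No equivalent unambiguous check exists for a cold email or an SEO article; "good" there is a judgment call, which is exactly where 80th percentile output stops being enough.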
-
Finished Part I of "Situational Awareness: The Decade Ahead." It was written by an ex-OpenAI employee and claims that we're likely to achieve AGI by 2027. Previously, I noted how annoying it was that the author trivialized the issue by noting that his prediction was merely a matter of "believing in straight lines." But I thought this was an interesting point: given that we've been consistently surprised at how effective scaling up compute and data has been for #LLMs, the burden of proof is on skeptics of this approach to show why it won't pan out. In other words, he's arguing for a presumption of optimism.

The justification for presumptions like this is often as pragmatic as it is epistemic. Consider the presumption of innocence, for example. I once heard a would-be juror and former law enforcement employee get dismissed because their prior experience led them to believe that "if someone is in this courtroom, they very probably did it." The judge made the right call. It may be true that if someone is being tried, they're more likely guilty than not, but this misses the pragmatic side of the presumption of innocence: on average, it is worse to lock up people who are innocent than to let the guilty go free. Presuming innocence operationalizes this value judgment.

What are the pragmatic considerations that inform whether a presumption of optimism around AGI is reasonable? It depends on who you are, of course. If you're a venture capitalist or a SaaS founder, presuming optimism is more reasonable than if you're a baker. Being wrong on AGI is asymmetrically costly for the founder and VC, but not for the baker.

I've talked a lot of trash about LLMs here over the past few months, but if this line of reasoning is correct, then I need a more serious argument for why scale won't work.
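The asymmetry argument is easy to make concrete with a toy expected-cost calculation (all probabilities and costs below are made up for illustration):

```python
# Toy model: whether presuming AGI optimism is reasonable depends on who
# bears the cost of being wrong. Numbers are illustrative only.

p_agi = 0.3  # grant the skeptics that near-term AGI is unlikely

# (cost of wrongly presuming AGI, cost of wrongly dismissing it)
costs = {
    "founder/VC": (1, 10),  # dismissing a real platform shift is ruinous
    "baker": (1, 1),        # the baker bakes either way
}

for role, (cost_presume, cost_dismiss) in costs.items():
    ec_presume = (1 - p_agi) * cost_presume  # wrong only if AGI doesn't arrive
    ec_dismiss = p_agi * cost_dismiss        # wrong only if AGI does arrive
    stance = "presume optimism" if ec_presume < ec_dismiss else "stay skeptical"
    print(f"{role}: presume={ec_presume:.1f} vs dismiss={ec_dismiss:.1f} -> {stance}")

# founder/VC: presume=0.7 vs dismiss=3.0 -> presume optimism
# baker: presume=0.7 vs dismiss=0.3 -> stay skeptical
```

Same probability of AGI, opposite rational stances: the presumption tracks the loss matrix, not the evidence. That's the pragmatic (as opposed to epistemic) half of the argument.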
-
Has anyone run a study that looks at perceived #LLM capabilities and prospects for improvement as a function of a person's empathic abilities/tendencies? We didn't evolve to distinguish between real and artificial minds, so all of us are prone to overestimate LLM capabilities (the Eliza effect). But given that empathy requires a strong sense of another mind, I wonder if empathic people experience this distortion more strongly.
-
Really love Fathom - AI Meeting Assistant's "percentage talking" and monologue detection features. Being a CEO is a great way to start believing too much of your own bullshit, and this feature helps me talk less and listen more.
-
It's the 4th, so here's my political post for the year: I feel really lucky to live in the US, but the way the agendas/platforms of both parties have progressed over the last few years has me pretty scared about our future prospects. Really hope we can get our shit together.
-
This heuristic is overrated: "What the smartest people do on the weekend is what everyone else will do during the week in ten years." Chris Dixon, Partner at a16z, said this 10+ years ago, and I've seen versions of it elsewhere. It's always justified by pointing out that the personal computer and the internet were created by nerds tinkering on weekends.

The nerds of the 80s and 90s were luckier than the above heuristic acknowledges. They happened to be tinkering with things that were extremely useful for the rest of us, but I've spent the last 10 years working with nerds who tinker with stuff that will have far less impact than the internet or personal computer. Whether innovative work happens on a "weekend" or could be classified as "tinkering" by "nerds" doesn't matter as much as whether the work is done by people working for a long time on a problem that is acute for a high-performing population and chronic for the rest of us.

2 examples:

1. Peter Attia has pointed out that a lot of the tech we find in our cars starts out in F1 racing vehicles and that there's a similar dynamic around health tech. A lot of it is born to help elite athletes perform better and then winds up trickling down to the rest of us. A great example of this is the way in which studying world-class cyclist performance has led to insights about diabetes and metabolic health more generally. (See his book Outlive for more.)

2. The leaders of Bell Labs noted that WWII drastically accelerated the pace of innovation since people urgently felt the need to solve concrete problems like making radio transmissions more reliable for soldiers. This is the opposite of weekend tinkering, and again, we see that deliberate effort towards solving the communication problem felt acutely by a high-performing military paid dividends for the rest of us who felt communication pain more chronically. (See The Idea Factory for more.)

If we did a systematic study of innovation, we'd find far more of it fell on the side of deliberate, 9-5+ work on problems that are acute for a high-performing segment and chronic for everyone else than on haphazard weekend tinkering. I might be wrong about that, but insofar as founders are in the business of innovation, they need to look beyond recent, cherry-picked examples to find the patterns that reliably produce useful new ideas. Looking at 2-4 examples from the last 3 decades is weak sauce.
-
Started reading "Situational Awareness: The Decade Ahead" by an ex-OpenAI employee claiming that we'll see another GPT-2-to-GPT-4-sized gain in #LLM capabilities by 2027. This sentence is infuriating: "it is strikingly plausible that by 2027, models will be able to do the work of an AI researcher/engineer. That doesn't require believing in sci-fi; it just requires believing in straight lines on a graph."

Sometimes believing in straight lines and believing in sci-fi are the same thing! Here's an example that really shouldn't be that hard to remember given that we're all in computing: Moore's law. The law is petering out, and if you "believe in straight lines," you'll believe that processors will have 400 billion transistors by 2030. Not even Moore thought this would happen. It's sci-fi.

The debate about LLM capabilities is oddly emotional, and it's leading smart people to reduce very complicated issues into cute phrases like this. The key, non-trivial question, as always, is whether the trend line will continue. For all the hype around this doc, I suspect the author will make a good case for this, but trivializing the issue by saying it's just a matter of "believing in straight lines" doesn't inspire much confidence.
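For anyone who wants the straight-line trap in numbers, here's a minimal extrapolation sketch (the 2024 baseline of ~50B transistors on a flagship chip is a rough assumption of mine, not a figure from the post):

```python
# Moore's law is a straight line on a log scale: transistor counts
# doubling roughly every two years. Extending the line naively gives
# the kind of 2030 figure nobody in the industry actually expects.

base_year, base_count = 2024, 50e9  # rough flagship-chip baseline (assumed)
doubling_period_years = 2

for year in (2026, 2028, 2030):
    doublings = (year - base_year) / doubling_period_years
    print(f"{year}: ~{base_count * 2**doublings / 1e9:.0f}B transistors")

# 2026: ~100B transistors
# 2028: ~200B transistors
# 2030: ~400B transistors
```

The line is perfectly straight; it's the physics underneath it that gives out. That's the whole point.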
-
There's a "fog of war" for startups just like there is in our favorite strategy games. Over the years, Bocar Dia, Timothy Chen, and Dana Oshiro have helped me see this.

Bocar once told me in passing that it's hard to understand a company's long-term vision from their site, which is why it's important for investors to actually talk to founders. He's right. You can't really know whether a startup's future roadmap will make your product irrelevant.

Tim once shared with me how Looker's GTM is really what helped them win. It's impossible to deeply know about that GTM by looking at product features. You'd have to actually go through their sales process to deeply understand how to compete with them.

Dana once casually mentioned that some startups are bullshitting. She doesn't get hung up on slick marketing like I have in the past. Makes sense. Investors gotta be bullshit detectors.

This fog of war applies a little less when you're looking at larger companies, but it's still there!
-
People often ask me if I have any fundraising advice. My silly answer is always: "Raise for a gen AI startup in summer 2023." I'll take a small amount of credit for having the sense to move up our initial fundraising timeline to catch the initial wave of LLM enthusiasm, but the main point of my joke is that business success is far more susceptible to luck than many people think. Remembering this is a good way to manage my ego in both good times and bad.
-
"STFU and sell 1000 bars." That's what Peter Rahul's dad told him when he was first starting RXBAR. I wish this advice loomed larger in my mind when I first started my founder journey 2 years ago. It's easy to get lost thinking about technical implementations, competitive analyses, and detailed product roadmaps. Jonah Midanik and Timothy Chen have both helped me move closer to the "STFU" mentality above. Jonah runs Forum Ventures' studio and once said, "we don't write any code until we've shown someone a Figma mockup and they say 'I will pay you for that.'" Tim shared how he's seen founders incorporate feedback into multiple product iterations and the users still never buy the damn product. "You're better off showing them an updated mockup," he once told me. I've read Lean Startup. I should have known better than to waste time on features/products no one wants, and yet, it takes short and concrete advice like "STFU and sell" to really get me to change my behavior. Happy to say that I've successfully sold deals off of mockups now. It works, even when you tell people they're mockups. If you have a credible promise of solving a problem for someone, you don't have to lie about how mature your solution is.
-
The New York Times published an article 2 days ago on how #LLMs have led to a boom of consultants who are helping companies figure out how to apply the new tech to their business. BCG, for example, grew AI work from 0 to 20% of its revenue in 2 years.

Lots of money will be wasted here. I'm already seeing examples from my network. What if there are no compelling applications for a business? Can a consultant just state this plainly after charging? Nope. They have to come up with something reasonable-sounding to justify their fee, tee up the sale of additional services, etc. Because many people overestimate the capabilities of #LLMs, they'll sign the expansion contract or they'll build something internally that turns out to be a dead end.

3 recommendations to avoid this outcome:

1. Don't assume the tech is good enough to have any application to your business. The demos are compelling. The real-world performance is much less so.

2. Systematically evaluate GPT-4's capabilities across a set of a few dozen examples relevant to your use case. Stick the results in Excel. Compute some metrics. It's not that hard, and yet, amazingly, many companies are engaged in oddly "vibe-based" evaluations of LLM performance. Because LLMs are so charming and human-like, these informal evaluations will lead to overestimates of their capabilities. (Google "the Eliza effect" if you want more on this.)

3. Once you understand baseline GPT-4 performance, ask your consultants to share the distribution of performance gains they're seeing over and above stock GPT-4 performance via their services like fine-tuning, RAG, etc. If they can't give you any numbers here, run. A case study is not good enough because the outcomes here are highly variable. You want to build a confidence interval around the expected performance gains and determine if the lower bound of that interval will lead to high enough perf to justify the investment.
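Recommendations 2 and 3 don't require anything fancier than a short script. A minimal sketch (the score function, exact-match grading, and the stock_gpt4/consultant_model callables are placeholders for your own use case and model clients):

```python
import random

def score(call_model, examples):
    """Per-example correctness: 1 if the model's output matches the label, else 0."""
    return [int(call_model(x) == y) for x, y in examples]

def bootstrap_ci(values, n_resamples=10_000, alpha=0.05):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    means = sorted(
        sum(random.choices(values, k=len(values))) / len(values)
        for _ in range(n_resamples)
    )
    return means[int(n_resamples * alpha / 2)], means[int(n_resamples * (1 - alpha / 2))]

# examples = [(input_text, expected_label), ...]  # a few dozen from your domain
# baseline = score(stock_gpt4, examples)          # recommendation 2
# variant = score(consultant_model, examples)
# gains = [v - b for v, b in zip(variant, baseline)]
# low, high = bootstrap_ci(gains)                 # recommendation 3
# If `low` isn't high enough to justify the investment, walk away.
```

Exact-match grading won't fit every use case (you may need rubric or human grading), but the shape of the analysis is the same: per-example scores, a gain distribution, and a lower bound you're willing to bet on.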
-
Dave Rigotti, Cofounder Inflection.io & founder PLGTM.com, replying to the first post above: "To be clear, I said banning ChatGPT and did not ban GPT! We use GPT for content creation a lot."