STARTUP SUCCESS

Aug 26, 2024

Scaling AI: How Fireworks AI is Transforming the Industry

Lin Qiao, Co-Founder & CEO of Fireworks AI, shares her journey from Meta to leading AI innovation. Don't miss this deep dive into startup strategies and AI trends.

In the rapidly evolving world of AI, standing out requires more than just innovation; it demands leadership and a clear vision. Enter Lin Qiao, the driving force behind Fireworks AI, who is redefining the limitations of speed, value and scale in generative AI.

With a $52M Series B secured earlier this summer, Fireworks AI has quickly become a major player after launching their public platform just one year ago. Lin, their CEO and Co-Founder, joins us to share the invaluable lessons she’s learned along her journey from leading Meta’s AI platform development team to founding an industry-disrupting startup.

We explore:

Fireworks AI’s customer-obsessed culture and deep belief that “if our customers are successful, we will be successful”
How their product iterations flowed organically from their customers’ evolving needs
Why moving fast and prioritizing for speed is key to their success
Where AI is headed as it becomes more sophisticated and the industry matures

Lin offers advice for founders on everything from how to avoid the “paralysis of analysis” in decision-making to strategies for driving down the cost of AI. Don’t miss this insightful conversation with one of the leading voices in AI innovation.

This discussion with Lin Qiao of Fireworks AI comes from our show Startup Success. Browse all Burkland podcasts and subscribe to the show on Apple podcasts.

Episode Transcript

Intro 00:01
Welcome to Startup Success, the podcast for startup founders and investors. Here, you’ll find stories of success from others in the trenches as they work to scale some of the fastest-growing startups in the world, stories that will help you in your own journey. Startup Success starts now.

Kate 00:19
Welcome to Startup Success. Today we have Lin Qiao in studio, who is the CEO and co-founder of Fireworks AI. Welcome, Lin.

Lin Qiao 00:29
Hey, thanks for having me, Kate.

Kate 00:31
Thanks. We’re excited to talk to you. We’ve seen Fireworks AI in the news a lot lately, so maybe to get started, if you want to tell us a little bit about your background, and then we can use that to lead into the founding of Fireworks AI.

Lin Qiao 00:50
Yeah, absolutely. I started my career as a researcher in the big data era, and I’m very passionate about creating cutting-edge technologies that drive the next wave of innovation. Later on, my research work became real products in a very big company, and I started to transition to being a software engineer. I found myself really passionate about converting a research idea into real impact and delivering that into the hands of end users. That motivated me to join Facebook, where they have the biggest surface area of consumers, and the impact is enormous. At Facebook, now Meta, we got exposed to really cutting-edge technology, and I had the fortune to be the head of a big part of the AI infrastructure team, driving PyTorch to be the dominant AI framework in the industry while, at the same time, consolidating the different AI use cases across all product surface areas to run on the platform we were building. So that has been a truly fulfilling experience. By the time we left, we were running more than 5 trillion inferences per day across all data centers globally. So we knew that we could definitely have a greater, bigger impact continuing this journey. But when I looked at the whole industry, they were catching up. They are going through a big transition, an AI-first movement, but at the same time, they don’t have the right amount of talent, they don’t have the right software stack, they don’t have the right hardware stack. So they’re struggling and suffering, while they understand that if they don’t embrace AI-first, their business could be jeopardized significantly. So that motivated us to start Fireworks. And our mission – at Meta it took us five years – is to enable developers and enterprises to drive AI-first creative product development not in five years, but in five weeks, or even five days. So that’s what we want to drive. And it has been a great journey so far.

Kate 03:13
Wow. I mean, it sounds like what you were doing at Meta was pretty impressive and fulfilling, right? So to take that and want to expand that for the whole ecosystem and bring it to other companies. Was that the primary motivation? I mean, that’s what I feel like I heard you say. (Definitely) Wow, that’s very commendable, especially because you were doing some really exciting stuff at Meta. So tell me about Fireworks AI. I mean, how does that work, exactly? Break it down for us. That’s a big undertaking.

Lin Qiao 03:50
I’d love to do that. So first of all, our goal is to make the AI infrastructure stack highly accessible to developers and enterprises, so we want to tackle the biggest challenges they face day to day. To understand those challenges, it helps to see that AI enablement is different from pre-AI. There are multiple phases: one is pre-AI, the second is traditional AI, and the third is Gen AI, or generative AI. Each phase has different challenges. So for example, going from non-AI to AI, we start to introduce GPUs. The reason is that the AI models, the deep learning models, the neural network models, are much more computationally intense. The small models can still run on CPUs, but for the big models, it’s not sufficiently efficient to run on CPUs; they have to run on GPUs, which have embarrassingly parallelizable execution flows. And then how to drive the maximum amount of efficiency on GPUs is a new area for many companies and developers. But that’s kind of the traditional AI world. And now we are moving into the generative AI world. What is not changing is that application developers and product developers still want to have a very interactive experience when they build a consumer-facing or prosumer-facing product. It has to be super interactive and responsive to have a great product experience. And the second is, if they hit product-market fit, they need to scale. The interesting thing is, you know, if they don’t hit product-market fit, they stagnate, or they scale quickly if they go viral. So, if they’re losing money at a small scale, because GPUs are very expensive, then they’re going to go bankrupt just when their business is actually becoming successful. So it kind of creates an interesting tension here with this technology. On one hand, it is very new and very empowering, because this technology didn’t exist before. Now, a lot of disruptive products and apps can live on top of it.
But on the other hand, because it’s so computationally intense, it’s so costly that, if you’re not careful, even if you have a successful product, you don’t have a successful business. So it’s very interesting, and we want to make sure our customers will be able to preserve that kind of interactive, responsive product experience by driving very low latency, but also have a sustainable business when that product grows, by driving the cost down. So our initial product offering is a high-performance, highly cost-efficient inference platform, and that’s the foundation layer of Fireworks. But that’s not the only product we build. Next to this product, we also recognize that all our customers or users have their proprietary data. They want to customize their model heavily to their business, to their use cases. And that also, interestingly, gives them a moat. It’s defensible, right? Because you use your data, you use your proprietary data to create your model. It’s yours, and you build on top of it. Then nobody can replicate it that easily. So we have a Fireworks adaptation engine that helps you convert your proprietary data into your private model. More than that, we can automate it: once you send traffic to us, we can automatically make your model better and better as it runs on our platform. So it can alleviate the burden on developers or enterprises of having to do it themselves. So that’s the adaptation engine. And on top of that we serve hundreds of models today. Those models go across text, the large language models; the image generation models, from text to image; the audio models, audio in and out or translation; the embedding models for vector search and semantic search; and the vision language models, which can understand what’s in your image and extract information. We have a wide variety of model offerings. And on top of that, we can compose all the state-of-the-art models into a system that can solve a complex problem.
So it’s called a compound AI system, the whole entire system. We actually now have a very complex product offering for our developers and enterprises.

Kate 08:54
That is incredibly robust. I had no idea your product offering was that complex. Very impressive. You didn’t start out with all of those in mind, right? Did that grow?

Lin Qiao 09:08
No. For this audience, it may be interesting to talk about how we think about the stages of our product. Initially, when we started, we saw a gap in the industry in general, especially in the PyTorch space as PyTorch was becoming the dominant framework: making PyTorch deployment and production setup easy. But we felt that if we did that very broadly, we would lose touch with end users. So we wanted to use verticals to reach the end users and the business cases, grow the business initially, and use that to build out the horizontal platform. So that’s why we picked Gen AI as a vertical. With Gen AI, when we initially talked with many customers – by the way, we got a lot of pull from the industry because the need is very high there – the initial concerns were latency, cost and quality. And latency and cost are two sides of the same coin. It’s performance, basically, and we had been operating PyTorch for years at Meta, where a 0.1% performance gain is a huge amount of money. So we are really good at operating at large scale to drive extreme performance optimization. So we felt, hey, this is a really good starting point for us. So that’s why we focused on latency and cost, and that’s the lowest layer of our product: our distributed inference engine, which is high-performance and highly cost-efficient. And as we rolled out this product, we got a lot of traction. We also started to see a lot of customers struggle in customizing the models using their own proprietary data. And then we felt, okay, in order for them to launch on our inference platform, we have to help them solve the quality problem. And that’s where this adaptation engine comes in, and this engine becomes smarter and smarter. And as we onboarded more customers, we understood, oh, their workflow is not just accessing one model. Their workflow is accessing multiple models, because their business problem is a complex problem.
It needs to be decomposed into, like, stitching together either multiple different modalities or different models, or sometimes calling out to other APIs, because one model has limited knowledge, right? Its knowledge is limited because the training data used to build the model is finite. It’s not infinite. So you have to access other APIs or databases or storage systems or file systems to get the additional, extended knowledge. So we understood, oh, in order for us to get our developers, our enterprises, to build their products quickly, we have to be able to compose. (Got it, okay.) And that’s why we built the compound layer on top. So the evolution is very organic, I would say. I think it’s just part of the culture or the spirit of Fireworks: we are very obsessed with customers. We are very obsessed with their challenges and how to add value to make their job easier.

Kate 12:28
First of all, I love how the iterations were organic, and then how you put it: you know, you’re obsessed with customers and helping them solve their problems and making it easier. Because when I talk to successful founders on this show, that’s what it all comes back to: that customer feedback, that loop, just continuing to iterate and build. Okay, so this is over a period of how many years?

Lin Qiao 12:56
So we launched our public platform in August last year. So it’s almost one year. And our company is less than two years old.

Kate 13:09
I mean, I would have guessed five with all the progress you’ve made. You’re just doing so much.

Lin Qiao 13:16
Yeah, this is a highly dynamic market.

Kate 13:19
You’ve got to move fast.

Lin Qiao 13:21
We moved really fast.

Kate 13:24
What kind of culture have you built at Fireworks AI? I mean, you must have a strong culture to produce and move this quickly.

Lin Qiao 13:35
Yeah. So I think first is, as I mentioned, we are very customer-obsessed. And I think as a company, we really believe that if our customers are successful, then we will be successful. We genuinely believe that. And all our prioritization goes to how to make our customers successful. And we really care about them. And in return, they feel that deeply also, right? So they are not just our customers, they also become more like our partners, and they are vouching for us. So it just naturally grew that way. So, that is that. And the second is we prioritize for speed. So this is kind of where we – because this show is about startups, entrepreneurs – I think one of the challenges as an entrepreneur, to compare, because I operated in a large company for a long time: the biggest difference is that in a large company, especially a B2C company, there is an abundance of data. You have a lot of data to analyze to death. Among trade-offs and options, the difference is 1% or 0.1%. In a startup environment, especially an early-stage startup, you don’t have all the data you want to do that level of analysis. And one piece of advice is, don’t get bogged down trying to reach a perfect understanding, asking, hey, where’s the data point for me to make the best decision? Oftentimes, you don’t have it. You have to kind of just be in the customer and developer cohort and get a sense of how things are going. And so, without perfect data, I would trade off data for velocity. I’d rather move forward and launch something and test and iterate quickly and fail fast than get stuck in the paralysis of analysis. So moving fast is kind of part of the spirit of this company. And we move really fast. So I think those are kind of the top things. And also within the company as we are growing, I think the one thing I believe deeply in is transparency. So we don’t really have secrets in the company. We are very chatty as a company.
We use Slack and have many Slack channels, but all the Slack channels are open to everyone to know what’s going on. I want to make sure we equip every single person in the company to make the right decision all the time. And only when our decision-making velocity is high can we, as a whole company, move really fast.

Kate 16:30
Thank you for sharing that. I think, for people tuning in, those are some important points. You know, transparency, you’re right. And when you come from a big company, you know, you get into that analysis paralysis, right, where you start to pull back. I think you shared some important things there. Switching gears a little bit, what is your take on AI right now? It’s the hottest sector, right? You must be excited. What are some of your thoughts about where it’s all headed?

Lin Qiao 17:04
So, I think AI is a very broad thing. Currently, we’re operating in Gen AI. I will say, my thoughts also evolved over time. When we started the company, I was very convinced that AI was going to sweep across the whole entire industry. It will be a landscape-reshaping moment, and we want to be a significant influencer and player in this transformation. And then the Gen AI wave started as a subset of AI, one that is very different from traditional AI. The fundamental difference is that Gen AI builds on foundation models, where the knowledge has been pre-populated and absorbed by those models, and then you just build on top of them. And the delta, the things you build on top of the model, should be small. And then you build the application or product on top of this easily. Pre-Gen AI, you have to train from scratch. That means you have to curate data. You have to have a team to curate it, you have to have a team to train. It’s very capital-intensive to invest in the team and all this. So, Gen AI significantly increased the velocity, right? So, huge difference. But when we started, we were at, like, the beginning of the S-curve of Gen AI. And I also had the question in my mind: is that hype or not, right? It was not clear that it would be the thing in the big AI segment. As we continued our product development and our business, the velocity of… there are a few interesting observations that hit me that are counter to my initial intuition. For example, in my mind, I was always thinking that our customer onboarding would go in a sequence through the industry segments. We would onboard startups first, because they are the most tech-forward, right? Gen AI is a new technology and they are the bravest segment, and they want to build innovative applications and products using this technology, so they’re Gen AI native. So startups first.
And then digital-native enterprises, because they are relatively tech-forward, and they also have strong engineering teams to embrace new technology, and they can move fast. And last is traditional enterprises, right? Because they’re relatively conservative, and they usually want to wait and see how the industry is shaping up before moving forward. Now we’re working with all of those. All of them came inbound to us. So it’s a little bit crazy to work with all of them, because they do have different requirements and so on. But also, it’s a very strong signal to me that the decision-makers in those companies, whether it’s a small, medium-sized or large company, all recognize that Gen AI is a fundamentally important technology to invest in. Of course, there are various different use cases landing in the segments, but they are all working to empower Gen AI. And in some companies, they even bootstrapped a whole entire team, a team of hundreds of people. So that was just kind of a very strong signal to me that, based on my limited observation, this is real. But of course, for it to become really real – I think we’re still at an early stage. I mean, we should think about the S-curve, right? We have definitely taken off. But we have not hit the vertical part yet. That will happen when a lot of killer apps start to show up. To give you an analogy, I still remember when Facebook started video as a new product. It was a new product experience that never existed before. I remember the most interesting thing that started on this video platform was people watching people putting rubber bands, one on top of each other, on a watermelon and watching to see when it was going to explode. This was about maybe seven years ago. It always happens that way. When a new technology starts, we start from simple things, but it becomes more and more sophisticated. It’s like when the desktop-to-mobile shift started: initially, I still remember, the very popular app was the flashlight. (Laughing).

Kate 21:50
Right, yes, right, which was, like, the coolest thing, right? Yes.

Lin Qiao 21:56
And then later on, we got Instagram. We got Instacart, right? So I think it will just take time. All the experiences being built out will compound with each other and lead to those killer apps. I don’t think we’re there yet, but I think the whole entire industry is building towards that.

Kate 22:19
It’s going to be really fascinating to see where this goes. Really fascinating. We’ve spent a lot of time on AI. It was really, really interesting to hear your take and hear about all the iterations and everything Fireworks AI has done. I could talk to you for a lot longer, but we’re coming up on time, and we always end the show with general advice you have for other startup founders. And I encourage everybody listening: you know, Fireworks AI, you’ve had an incredible fundraising track record. You’ve done so much that you just shared with us. So just, you know, one or two things you could share for those other founders out there on the journey.

Lin Qiao 23:02
I will say, as a founder of a startup, there is a lot of pressure on your shoulders all the time, and it sometimes feels lonely to kind of be in it by yourself. But I feel I personally benefit tremendously from talking to other founders, especially at different stages, to broaden my view and the horizon of what’s possible, right? And I think being an entrepreneur is basically saying, hey, there’s no runbook, we’re creating one, right? If there’s a runbook to follow, it’s not a startup. You’re creating something from scratch that doesn’t exist, whether it’s technology, whether it’s product, whether it’s business, like go-to-market; you’re going to innovate somewhere. And talking to other founders about where they fail, where they’re successful, what works, what doesn’t work, is tremendously helpful to me. So I feel if you have the resources to tap into that, it would be great to allocate the time. We’re always busy day to day, we need, like, a 48-hour day, but budgeting time to exchange deep knowledge with other founders is worth it.

Kate 24:23
That’s very helpful. I’ve heard that not recently, but, you know, a few times on the show, and it seems like that support, right? You said that it can get lonely and it’s a lot of pressure. Having a community around you makes a lot of sense. Thank you for kind of the deep dive into AI, where you think it’s going, and everything that Fireworks AI has accomplished. For people wanting to learn more about Fireworks AI, where should they go?

Lin Qiao 24:52
Yes, so they can go to fireworks.ai to get started.

Kate 24:56
Okay, great, great. Thank you so much for being here today. Lin, it was fascinating talking to you. I wish we had more time, but I appreciate you taking the time that you did.

Lin Qiao 25:08
It was really fun. Thanks for having me.

Intro 25:09
You’ve been listening to Startup Success. To make sure you don’t miss out on future episodes, subscribe to the show in your favorite podcast player. Like what you hear? Tap the number of stars you think the show deserves on Apple Podcasts. For more tools and resources for your own startup success, check out burklandassociates.com. Thank you so much for listening.
