{ "title": "Less Control, More Tools: How Genspark Built a Super Agent That Scales", "titleTextColor": "#000", "date": "Jul 29, 2025", "authorAvatar": "https://gensparkpublicblob-cdn-e6g4btgjavb5a7gh.z03.azurefd.net/user-upload-image/manual/gen_avatar.png", "author": "Kay, Co-founder and CTO of Genspark" }

Less Control, More Tools: How Genspark Built a Super Agent That Scales

Our Co-founder & CTO Kay Zhu participated in the VentureBeat Transform 2025 conference on June 25, delivering a presentation on Genspark's innovative approach to building AI agents. The presentation received exceptional feedback from attendees, with many requesting access to the full content and slides for further study and discussion.

In response to this overwhelming interest, we are pleased to share the complete presentation content and supporting materials with our community. This comprehensive overview expands on the original 12-slide presentation, providing the technical depth and implementation insights that time constraints prevented during the live session.

We believe this material will be valuable for professionals working with AI agents or developing AI strategies, and we encourage mutual learning and exchange of ideas within our community.

Less Control, More Tools

Live Demo Session

Kay opened the presentation with an extensive live demonstration of Genspark's Super Agent capabilities. First, Genspark researched VB Transform 2025 speakers in real time and drafted presentation slides. When prompted with "Who's the most famous?", Genspark identified Andrew Ng as the most prominent speaker.

In a particularly engaging moment, Kay utilized the "Call for Me" feature to make a live call to organizer Matt at the venue, half-jokingly requesting to move the presentation slot before Andrew Ng's session, demonstrating the agent's ability to perform real-world actions.

The demonstration continued with several examples:

Transformation of a Nikola Tesla introduction into an interactive science fiction website
Jupyter-backed analysis of a marketing dataset using pandas and matplotlib, including correlation studies and executive-ready narrative
Gmail and Google Drive integration to compile PM applicant information, save résumés to AI Drive, and draft personalized responses
The presentation deck itself, created using "vibe-PPT" functionality within Genspark, including image generation and style modifications

Keynote Session

The following is a transcript of Kay's presentation:

We position Genspark as an all-in-one AI workspace with a simple vision: bring the Cursor-like experience that developers love to everyone. We want everyone to experience "vibe working": vibe PPTing, vibe Exceling, vibe researching, vibe emailing, even vibe calling. In practice, that means you describe what you want, and our Super Agent plans the steps, picks the right tools, and produces the asset (slides, docs, code, webpages, analyses, or even phone calls) without forcing you into rigid flows.

This isn't theoretical. The same Super Agent that builds slides can also do deep research on an event and draft the deck, digest a six-hour video (like a shareholder Q&A) into a clean brief, turn a simple prompt about Nikola Tesla into an interactive website your kid will love, run Jupyter-powered analysis on a messy marketing spreadsheet (with charts, correlations, and a narrative), or connect to Gmail to compile applicants, download résumés to AI Drive, and draft personalized responses. These are everyday use cases we and our users run—fast, reliably, and end-to-end inside one workspace.

We launched our Super Agent in April and the response was incredible. We hit $10 million ARR in just 9 days and $36 million ARR in 45 days. In 10 weeks, we launched 8 products: AI Browser, AI Secretary, AI Personal Calls ("Call for Me"), AI Download Agent, AI Drive, AI Sheets, AI Slides, and our Super Agent. You can try all of these at genspark.ai.

Genspark's Core Design Philosophy: "Less Control, More Tools"

Let me share our design principles for building super agents. We call it "Less Control, More Tools."

In Genspark, we believe you need to let agents plan—don't hardwire flows. And you need to arm agents with sharp tools, not rigid rules. When you give rigid rules to agents, you actually lower their capability ceiling significantly.

Let's talk about what "less control" really means. In Genspark, we've discarded all hardwired workflow infrastructure and fully embraced the agentic engine.

The fundamental difference between workflows and agents is how they handle the unexpected. Workflows are predefined steps that break on edge cases. No matter how many exceptions you code for, reality will always surprise you. When a workflow hits an unanticipated situation, it either fails or continues down a suboptimal path. Errors accumulate—each small failure compounds downstream.

Agents operate differently. At its core, an agent is an LLM in a while loop: plan, execute, observe, and crucially—backtrack. When an agent encounters the unexpected, it doesn't fail or proceed blindly. It observes the current state and decides the optimal next action based on what's actually happening, not what was supposed to happen. If something goes wrong, it finds alternative solutions, tries different tools, or completely reimagines its strategy.
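
As a rough illustration of the plan, execute, observe, backtrack loop described above, here is a minimal Python sketch. It is not Genspark's implementation: the `plan` function is a hypothetical stand-in for an LLM call, and the tool names (`search`, `cache`) are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Step:
    tool: str
    arg: str


@dataclass
class AgentState:
    goal: str
    observations: list = field(default_factory=list)
    failed_tools: set = field(default_factory=set)
    result: Optional[str] = None


def plan(state: AgentState, tools: dict) -> Optional[Step]:
    """Hypothetical planner: a real agent would ask an LLM here.

    This stand-in picks the first tool that has not failed yet.
    """
    for name in tools:
        if name not in state.failed_tools:
            return Step(tool=name, arg=state.goal)
    return None


def run_agent(goal: str, tools: dict, max_steps: int = 10) -> AgentState:
    state = AgentState(goal=goal)
    for _ in range(max_steps):
        step = plan(state, tools)            # plan
        if step is None:                     # nothing left to try
            break
        try:
            output = tools[step.tool](step.arg)   # execute
        except Exception as exc:
            # observe the failure, then backtrack: rule out this tool and re-plan
            state.observations.append(f"{step.tool} failed: {exc}")
            state.failed_tools.add(step.tool)
            continue
        state.observations.append(f"{step.tool} ok")  # observe success
        state.result = output
        break
    return state


def flaky_search(query: str) -> str:
    raise TimeoutError("search backend unavailable")


def cached_lookup(query: str) -> str:
    return f"cached answer for {query!r}"


state = run_agent("VB Transform speakers",
                  {"search": flaky_search, "cache": cached_lookup})
print(state.result)        # cached answer for 'VB Transform speakers'
print(state.observations)
```

The failed search does not end the run; it becomes an observation that changes the next plan, which is the recoverability property the talk contrasts with fixed workflows.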

This is the key insight: In workflows, errors accumulate. In agents, errors are recoverable.

Think of it this way: a workflow is like a train on tracks: efficient when everything goes as planned, catastrophic when it derails. An agent is like a skilled driver with GPS: when the planned route is blocked, they find another way to the destination.

While an agent sounds simple—just an LLM in a loop—building a good agentic engine requires sophisticated design in two areas: model orchestration and the improvement loop. Let me show you how Genspark's agentic engine maximizes these capabilities.

Our agentic engine embodies the two critical areas I just mentioned: model orchestration and the improvement loop.

Model Orchestration: Making Everything Work

We've made a deliberate architectural choice: our engine combines nine different-sized LLMs in a mixture-of-agents approach, alongside 80+ specialized tools and 10+ premium datasets. We believe the future is multi-model, not "one model to rule them all."

This belief comes from hard-won experience. As strategic partners of OpenAI, Anthropic, and other leading AI companies, we've extensively tested every major model in production. The reality is clear: no single model excels at everything. OpenAI's models shine at deep research and creative writing. Claude demonstrates superior agentic reasoning and complex coding. Gemini outperforms in multimodal understanding: analyzing images, videos, and complex visual data.

The "one model to rule them all" approach sounds appealing in theory, but it's like asking a Formula 1 car to also be the best truck, SUV, and family sedan. Our orchestration leverages each model's unique strengths: creative tasks go to OpenAI, complex reasoning to Claude, visual analysis to Gemini. By intelligently routing queries to the model best suited for each task, we deliver experiences that feel like genuine AGI. This multi-model orchestration gives users the best possible experience—not compromises, but the best of every world.
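
The routing idea can be sketched in a few lines of Python. This is an illustration only: the routing table mirrors the three pairings named in the talk, the strings are provider labels rather than real API model identifiers, and the keyword-based `classify` function is a hypothetical placeholder for whatever classifier or learned router a production system would use.

```python
from enum import Enum


class TaskKind(Enum):
    CREATIVE = "creative"
    REASONING = "reasoning"
    VISUAL = "visual"


# Pairings taken from the talk; values are labels, not API model names.
ROUTES = {
    TaskKind.CREATIVE: "openai",
    TaskKind.REASONING: "claude",
    TaskKind.VISUAL: "gemini",
}


def classify(query: str, has_image: bool = False) -> TaskKind:
    """Toy classifier (assumption): keyword matching instead of a learned router."""
    if has_image:
        return TaskKind.VISUAL
    text = query.lower()
    if any(word in text for word in ("story", "poem", "slogan")):
        return TaskKind.CREATIVE
    return TaskKind.REASONING


def route(query: str, has_image: bool = False) -> str:
    """Return the provider label for the model family that should handle the query."""
    return ROUTES[classify(query, has_image)]


print(route("Write a poem about Nikola Tesla"))          # openai
print(route("Refactor this module into smaller parts"))  # claude
print(route("What does this chart show?", has_image=True))  # gemini
```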

The Improvement Loop: Making Everything Better

Here's where the magic compounds. We use LLMs as judges to evaluate entire agent sessions, generating overall rewards. But we go further—we attribute these rewards to each individual step in the session. This granular feedback reveals exactly which tool choices work best in which situations.
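
The talk does not say how the session-level reward is attributed to individual steps. The Python sketch below assumes one simple scheme, purely for illustration: a judge supplies a per-step score, the session reward is split in proportion to those scores, and the credits are averaged per (situation, tool) pair. The field names `situation`, `tool`, and `score` are hypothetical.

```python
from collections import defaultdict


def attribute(session_reward: float, step_scores: list) -> list:
    """Split one session-level reward across steps in proportion to per-step scores."""
    if not step_scores:
        return []
    total = sum(step_scores)
    if total == 0:
        # No step stood out, so share the reward evenly.
        return [session_reward / len(step_scores)] * len(step_scores)
    return [session_reward * score / total for score in step_scores]


def tool_stats(sessions: list) -> dict:
    """Average attributed reward for each (situation, tool) pair."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for reward, steps in sessions:
        credits = attribute(reward, [step["score"] for step in steps])
        for step, credit in zip(steps, credits):
            key = (step["situation"], step["tool"])
            sums[key] += credit
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Two made-up sessions: (session reward from the judge, list of steps).
sessions = [
    (1.0, [
        {"situation": "chart", "tool": "python", "score": 3.0},
        {"situation": "deck", "tool": "slides", "score": 1.0},
    ]),
    (0.0, [
        {"situation": "chart", "tool": "browser", "score": 1.0},
    ]),
]

stats = tool_stats(sessions)
print(stats[("chart", "python")])   # 0.75
print(stats[("chart", "browser")])  # 0.0
```

Under this assumed scheme, the table of per-pair averages is the kind of signal that could show which tool tends to work best in which situation.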

Every day, as users interact with Genspark, we accumulate hundreds of thousands of real-world examples of successful tool selection and model routing. This experience is internalized through reinforcement learning and prompt playbooks, continuously improving our agentic engine. The beautiful part? Even if I gave our entire engineering team a vacation, Genspark would keep getting smarter, learning from every user interaction to serve future requests better.

This multi-model, continuously learning approach isn't a temporary workaround—it's the future of AI systems.

I was told that many attendees at VentureBeat come from enterprise backgrounds. While Genspark is primarily a consumer product, I'd like to humbly offer some thoughts on how these principles might apply in enterprise contexts.

I've discussed this extensively with friends in enterprise startups, and the reality is clear: 80%+ of current enterprise workflows work very well. They're predefined, thoroughly tested, fast, and reliable. There's no need to fix what isn't broken.

But here's what my nearly 20 years in search has taught me. Today's Google delivers far better experiences than it did 10 years ago, yet its overall query satisfaction rate remains around 75%. Why? Because when users discover your product is intelligent, they ask harder questions. When they hit limits, they retreat to simpler queries. It's a constant dance between user ambition and system capability.

This same phenomenon applies to AI agents. Even if 80% of work can be handled by rigid workflows, I suggest routing the hardest 20% (the cases where workflows fail) to agents. Remember what I said earlier: workflows accumulate errors, but agents recover from them. Those edge cases that break your workflows? That's exactly where agents shine.

My suggestion for enterprises: Keep your workflows for the 80% they handle well, but let agents tackle the failures. Then watch as that 20% grows steadily larger, because in the AI era, user expectations only go up.
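
One way to picture this suggestion is a thin dispatcher that tries the existing workflow first and hands only the failures to an agent. The Python sketch below is an assumption-laden illustration: `refund_workflow`, `agent_fallback`, and the ticket fields are hypothetical, and the agent is a placeholder string rather than a real agent loop.

```python
class WorkflowError(Exception):
    """Raised when a predefined workflow hits a case it was not built for."""


def refund_workflow(ticket: dict) -> str:
    # Hard-wired happy path: only handles tickets that carry an order id.
    if "order_id" not in ticket:
        raise WorkflowError("missing order_id")
    return f"refund issued for order {ticket['order_id']}"


def agent_fallback(ticket: dict, reason: str) -> str:
    # Placeholder for an agent; a real one would plan and call tools here.
    return f"agent handling ticket ({reason}): {ticket.get('text', '')}"


def handle(ticket: dict, stats: dict) -> str:
    """Try the workflow first; route only its failures to the agent."""
    try:
        result = refund_workflow(ticket)
        stats["workflow"] += 1
    except WorkflowError as exc:
        result = agent_fallback(ticket, str(exc))
        stats["agent"] += 1
    return result


stats = {"workflow": 0, "agent": 0}
print(handle({"order_id": "A-17"}, stats))
print(handle({"text": "charged twice, no receipt"}, stats))
print(stats)  # {'workflow': 1, 'agent': 1}
```

Tracking the two counters over time is one way to observe whether the share of agent-handled cases grows as the talk predicts.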

Let me explain "more tools." We believe tools have network effects. As a software engineer, I'm very familiar with Linux terminals. What I love about Linux is the magic of the pipe: you can connect different commands together, where the first command's output becomes the second command's input. Linux can do amazing things just in the terminal.

Today's Super Agent engine is like an intelligent pipe—it can smartly connect all kinds of tools and deliver the final result beautifully.
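
The pipe analogy translates directly into function composition. The following Python sketch shows a fixed pipe, roughly what `grep ERROR | wc -l` does in a shell; the three stage functions are hypothetical stand-ins for tools.

```python
from functools import reduce


def pipe(*stages):
    """Compose stages left to right, so each output feeds the next input."""
    return lambda value: reduce(lambda acc, stage: stage(acc), stages, value)


# Hypothetical stand-ins for tools.
def split_lines(text: str) -> list:
    return text.splitlines()


def keep_errors(lines: list) -> list:
    return [line for line in lines if "ERROR" in line]


def count(lines: list) -> int:
    return len(lines)


count_errors = pipe(split_lines, keep_errors, count)

log = "INFO start\nERROR disk full\nINFO retry\nERROR timeout"
print(count_errors(log))  # 2
```

The difference the talk points to is that here a person chose the stages in advance, whereas an agent decides at run time which tools to connect and in what order.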

Some of my friends are purists who believe you should only give agents a keyboard, mouse, and computer—nothing else is necessary. They argue that since AI can browse, code, and theoretically write tools on the fly, why invest in pre-built specialized tools? This fundamentalist view misses how professional excellence actually works.

Here's why they miss the mark. Imagine two engineers with the same skills and intelligence: one with a blank Mac, another with a Mac preloaded with all work-critical apps, specialized development environments, testing frameworks, and domain-specific tools. Who ships faster? Obviously the latter. More tools don't limit capabilities—they multiply them.

In Genspark, besides terminal, browser, and computer access, we give agents 80+ additional specialized tools: Python interpreters, slide makers, Crunchbase API, video generators, image generators, database connectors, analytics platforms, and more. Here's the key insight: each new tool doesn't just add one capability—it creates exponential combinations with every existing tool. With 80 tools, you're not looking at 80 capabilities, you're looking at tens of thousands of potential combinations. Tool A's output can feed into Tool B, which combines with Tool C—the possibilities explode.
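
To put rough numbers on the combination claim, the Python sketch below counts ordered chains of distinct tools. It assumes, optimistically, that any tool's output can feed any other tool and that no tool repeats in a chain; in practice only some chains are meaningful, so these are upper bounds.

```python
import math


def chains(n_tools: int, length: int) -> int:
    """Number of ordered chains of `length` distinct tools drawn from `n_tools`."""
    return math.perm(n_tools, length)


for length in (1, 2, 3):
    print(length, chains(80, length))
# 1 80
# 2 6320
# 3 492960

# Adding an 81st tool creates 160 new two-step chains, not just one new capability.
print(chains(81, 2) - chains(80, 2))  # 160
```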

The smartest professionals understand this. They invest heavily in curating the best pre-built tools for their domain. When a new task arrives, they immediately reach for the optimal tool. While they can create solutions from scratch when necessary, they know that specialized tools deliver better, faster, more reliable results.

Our agents work the same way. The magic happens in the intelligent orchestration—our agent knows when to use which tool, how to chain them together, and when to create custom solutions. It's not just a pipe anymore—it's an intelligent network where every tool amplifies the capabilities of every other tool.

Here's a crucial insight for enterprise teams: whether you choose workflows or agents as your technical solution, what impacts your results most is actually your toolset's quality and coverage. This is often overlooked, but it's the foundation of everything.

I encourage you to audit your current tool stack with fresh eyes. Ask yourself: does it enable network effects, where each tool amplifies the value of others? Or are your tools isolated silos that lead to dead ends? Many enterprises have accumulated tools over years without considering how they work together—or don't.

When designing your toolset, the most important consideration is composability. Each tool should expose clean, well-documented APIs. Think of it like building with LEGO blocks—every piece should connect seamlessly with every other piece. This allows your system to automatically select and combine tools in ways you might never have anticipated.
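
A small Python sketch can show what composability buys. It assumes each tool declares what kind of data it consumes and produces; given that, a system can search for a chain on its own. The `Tool` class, the type labels, and the three example tools are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Tool:
    name: str
    consumes: str               # label of the input data type
    produces: str               # label of the output data type
    run: Callable[[Any], Any]


def find_chain(tools: list, have: str, want: str) -> Optional[list]:
    """Breadth-first search for the shortest tool chain turning `have` into `want`."""
    queue = deque([(have, [])])
    seen = {have}
    while queue:
        kind, path = queue.popleft()
        if kind == want:
            return path
        for tool in tools:
            if tool.consumes == kind and tool.produces not in seen:
                seen.add(tool.produces)
                queue.append((tool.produces, path + [tool]))
    return None  # dead end: no chain connects the two types


tools = [
    Tool("make_slides", "summary", "slides", lambda s: ["Summary", s]),
    Tool("load_csv", "csv_text", "table",
         lambda text: [row.split(",") for row in text.splitlines()]),
    Tool("summarize", "table", "summary",
         lambda rows: f"{len(rows) - 1} data rows"),
]

chain = find_chain(tools, "csv_text", "slides")
print([tool.name for tool in chain])  # ['load_csv', 'summarize', 'make_slides']

value = "a,b\n1,2\n3,4"
for tool in chain:
    value = tool.run(value)
print(value)  # ['Summary', '2 data rows']
```

A tool whose input or output matches nothing else in the registry shows up here as a dead end (`find_chain` returns `None`), which is one concrete way to audit for the isolated silos mentioned above.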

The companies that get this right don't just have tools—they have a tool ecosystem. Each new addition multiplies the capabilities of the entire system. Investing in composable, high-quality tools is what will ultimately determine your success in the AI era.

Building AI-Native Teams

Let me share how to build an AI-native team. Today, speed is everything. An AI-native team moves the fastest.

At Genspark, we started with just 20 people (now 24), but everyone "vibe codes"—our PMs code, our designers code, even our CEO codes. This isn't just about democratizing coding; it's about fundamentally changing how we think about building products. More than 80% of our code is written by AI tools like Cursor and Claude Code.

The term "vibe coding" captures something profound about this new way of working. When you're vibe coding with Cursor or Claude Code, you're having a natural conversation with AI about what you want to build. You describe your vision, and the AI helps manifest it into actual code. It's like having a brilliant senior engineer pair programming with you 24/7—never tired, always ready to help, constantly learning from the latest patterns and best practices.

We've found that people with an architect mindset—those who can think in systems and understand the big picture—thrive in this environment. They focus on "taste": understanding what to build, why to build it, and ensuring every decision has clear intentionality. The syntax, the implementation details, the boilerplate—AI handles all of that. What remains uniquely human is the judgment, the product sense, the understanding of user needs.

But here's what many people miss: "vibe coding" doesn't mean lowering our standards. We maintain extremely rigorous code reviews precisely because so much code is being generated. Every single line—whether written by human or AI—goes through the same strict review process. Senior engineers review for architecture, security, performance, and maintainability. This is absolutely critical for keeping our codebase healthy and sustainable. The combination of AI acceleration and human quality control enables both speed and sustainability. We ship fast, but we ship right.

The results speak for themselves: one PM who joined us to work on a product started vibe coding with Cursor within two weeks and is now the major contributor to AI Slides. Our AI Browser was built by a single engineer in less than 3 months. With just 20 people (now 24), we shipped 8 AI products in 10 weeks. Since then, we've continued this momentum, shipping AI Docs, AI Pods (which generates podcasts), and expanding Call for Me globally.

In the AI era, small elite crews with full autonomy out-ship giants. When everyone on your team can leverage AI to be 10x more productive, a team of 20 can achieve what traditionally required 200 or more people.

From my experience at Google and Baidu, where I managed teams of over 1,000 people, I've learned that traditional scaling creates inherent inefficiencies. Large organizations carry massive overhead: communication complexity grows quadratically, coordination costs compound, decision-making slows to a crawl. But now at Genspark, with our AI-native approach, we're moving faster than ever before.
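
A common way to make "grows quadratically" concrete is to count pairwise communication channels, n(n - 1)/2 for a team of n people. The short Python sketch below applies that formula to the team sizes mentioned in the talk; treating every pair as a channel is a simplifying assumption.

```python
def channels(n: int) -> int:
    """Pairwise communication channels among n people: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for size in (20, 24, 200, 1000):
    print(size, channels(size))
# 20 190
# 24 276
# 200 19900
# 1000 499500
```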

The key is that everyone must fully embrace AI tools. At Genspark, we don't just build AI products; we eat our own dog food. Our team uses Genspark internally for everything: deep research, data analysis, marketing studies, UI prototyping, generating marketing content, etc. We also embrace the latest external AI tools, investing in premium access to Cursor, Claude Code, and other cutting-edge solutions. This creates a powerful feedback loop: we build better products because we're power users ourselves, and we become more productive by using what we build.

I've heard that some large organizations block access to the latest AI tools, fearing security risks or loss of control. This approach is fundamentally against future trends. As AI evolves exponentially, these organizations aren't just falling behind—they're becoming irrelevant. Future competitiveness will directly correlate with how openly and aggressively companies embrace AI.

If you're in a large organization, you have two options. First, form an internal strike team—your own "Ocean's Eleven" of AI-native builders. Select people passionate about the future and willing to take risks. Give them full autonomy, the best tools, and meaningful incentives. Start from one seed and grow into a small task force that can eventually replace traditional structures. Second option: partner with AI-native startups who already live this way.

Building an AI-native team is like building an RL algorithm—keep executing, learning, and improving. Every time we use AI tools (both external and our own Genspark) to accelerate our work, we get better at directing them. Every project completed faster teaches us new patterns. The team's collective intelligence compounds over time, creating a virtuous cycle of productivity and innovation.

The window for building AI-native competitive advantages is open now, but it won't stay open forever. The future belongs to small, elite teams that fully embrace AI while maintaining the discipline to build sustainable, high-quality systems.

Summary

To wrap up, the message is clear: "Less Control, More Tools" isn't just a technical philosophy—it's a fundamental rethinking of how we build intelligent systems. Small AI-native teams represent the fastest path forward in this new era. This presentation itself was created using our Super Agent, proving that we practice what we preach.

The future belongs to those who can embrace intelligent autonomy, build composable tool ecosystems, and trust small teams to move fast. At Genspark, we're not just building products—we're pioneering a new way of working where AI and humans collaborate as true partners.

Kay's insights on AI agent architecture have resonated across multiple industry events. For readers interested in exploring these concepts further, we've compiled key resources that expand on the ideas presented at VentureBeat Transform.

Related Resources:

Presentation Materials: Full slides of this presentation
Extended Discussion: Complete presentation video at EntreConnect
Media Coverage: VentureBeat article | EntreConnect deep dive
