Why Lead Enrichment in Salesforce Is a Challenge (and How AI Helps)
Have you ever wanted to find out more information about your leads—like phone numbers, email addresses, company size, or any contact data? Who hasn’t!
Recent advances in AI have created new opportunities to enrich Salesforce data and extend CRM systems with autonomous agents. At GRAX, we are using the power and simplicity of our CRM Parquet data lake to build AI-based solutions for this type of business problem, hosted on Heroku and made accessible via Agentforce.
Finding information for individual lead records is a tedious task, which is why an entire industry segment, lead enrichment tools, is already dedicated to it. For public information, it would be preferable to simply ask an agent “go search the web for this lead and update the record for me” directly from Salesforce.

With GRAX, Heroku, and Agentforce you can do just that. In addition, with GRAX protecting your data, you are able to safely allow AI-powered lead automation to make changes to your Salesforce instances knowing that the entire history of your data is securely backed up in the cloud and easily restored.
How to Build AI-Powered Lead Enrichment in Salesforce
Following up on our first post about an agent that can answer questions about your Salesforce data lake, we wanted to demonstrate what you can do with an agent that has more capabilities (aka “Tools”) and is directed at a single business problem: augmenting lead data.
The capabilities our agent needs for this task are:
- Query Salesforce data
- Search the web
- Visit sites on the web
- Update Salesforce
By leveraging the open-source framework LangGraph, we can equip any LLM of our choosing (Claude, OpenAI, DeepSeek, etc.) with these tools by changing just two lines of code. Our Agentforce Action is available on GitHub.
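The idea of a fixed set of tools driven by an interchangeable model can be sketched in plain Python. This is a minimal illustration, not the LangGraph API and not the code in the GitHub repository: `Tool`, `run_agent`, and `scripted_model` are hypothetical names, the tool bodies are stubs, and the "model" is a scripted stand-in so the example runs without any API key.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# A "model" here is any callable that looks at the history so far and either
# requests a tool call as (tool_name, argument) or returns None when finished.
Model = Callable[[List[str]], Optional[Tuple[str, str]]]


@dataclass
class Tool:
    name: str
    run: Callable[[str], str]


# Hypothetical stubs for the four capabilities listed above.
TOOLS: Dict[str, Tool] = {
    t.name: t
    for t in [
        Tool("query_salesforce", lambda arg: f"rows for: {arg}"),
        Tool("search_web", lambda arg: f"results for: {arg}"),
        Tool("visit_site", lambda arg: f"page text of: {arg}"),
        Tool("update_salesforce", lambda arg: f"updated: {arg}"),
    ]
}


def scripted_model(history: List[str]) -> Optional[Tuple[str, str]]:
    """Stand-in for an LLM: follows a fixed plan, one step per turn."""
    plan = [
        ("query_salesforce", "lead 00Q-example"),
        ("search_web", "Jane Doe Example Corp"),
        ("visit_site", "https://example.com/team"),
        ("update_salesforce", "Title=VP Sales"),
    ]
    return plan[len(history)] if len(history) < len(plan) else None


def run_agent(model: Model, tools: Dict[str, Tool], max_steps: int = 10) -> List[str]:
    """Ask the model for the next tool call until it stops or the step budget runs out."""
    history: List[str] = []
    for _ in range(max_steps):
        step = model(history)
        if step is None:
            break
        name, arg = step
        history.append(f"{name} -> {tools[name].run(arg)}")
    return history


# Swapping the model means changing only the first argument passed here.
trace = run_agent(scripted_model, TOOLS)
```

Because the loop depends only on the `Model` signature, replacing one model with another does not touch the tools, which is the property that keeps a model swap small.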
We are currently using an agent to analyze our lead information, craft a web search, and visit webpages to discover more firmographic data like:
- Job title
- Location
- Even the correct spelling of their name
This is the type of task that is well suited to an AI: it is constrained in a clear way (conduct a search, visit websites, etc.), open-ended in an obvious one (searching the web), and leverages LLMs’ ability to summarize small to medium pieces of information well (by updating the single lead).
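The "craft a web search" step can be illustrated with a small helper. This is a sketch under assumptions: `build_search_query` is a hypothetical function, not part of the sample app, and it assumes the lead arrives as a mapping keyed by the standard Salesforce Lead fields `FirstName`, `LastName`, `Company`, and `Email`. In the real agent the LLM writes the query itself.

```python
from typing import Mapping, Optional


def build_search_query(lead: Mapping[str, Optional[str]]) -> str:
    """Combine whichever lead fields are populated into one web search string."""
    parts = []
    name = " ".join(filter(None, [lead.get("FirstName"), lead.get("LastName")]))
    if name:
        parts.append(f'"{name}"')  # quote the name so it is matched as a phrase
    if lead.get("Company"):
        parts.append(lead["Company"])
    email = lead.get("Email") or ""
    if "@" in email:
        # The email domain often points at the employer's website.
        parts.append(email.split("@", 1)[1])
    return " ".join(parts)


query = build_search_query(
    {"FirstName": "Jane", "LastName": "Doe", "Company": None, "Email": "jane@example.com"}
)
# query is now: "Jane Doe" example.com
```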

See AI-Powered Salesforce Lead Enrichment in Action
Without GRAX and Heroku, Agentforce can only summarize the lead information already in the single snapshot within Salesforce; it can’t go out and find new information.
With GRAX, here’s what an interaction with our Lead Augmentation Agent Action looks like:

As you can see, our Lead Augmentation Action, which is an agent itself, does an impressive job going out and finding the information, distilling it into a proposed change, and actually performing the update. This automation turns what would normally be a process of searching LinkedIn, company websites, and news articles into a simple conversation with your Salesforce instance — letting your sales reps focus on building relationships rather than gathering basic information.
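One way to picture the "proposed change" step is as a field-by-field comparison between the current lead and what was found on the web. The sketch below is an assumption about how such a step could work, not a description of the sample app: `propose_changes` and the allow-list of fields are hypothetical.

```python
from typing import Dict, Iterable, Mapping, Optional


def propose_changes(
    current: Mapping[str, Optional[str]],
    discovered: Mapping[str, Optional[str]],
    allowed_fields: Iterable[str],
) -> Dict[str, Dict[str, Optional[str]]]:
    """Return only the allowed fields whose discovered value is new and non-empty."""
    changes: Dict[str, Dict[str, Optional[str]]] = {}
    for field in allowed_fields:
        old = (current.get(field) or "").strip()
        new = (discovered.get(field) or "").strip()
        if new and new != old:
            changes[field] = {"from": old or None, "to": new}
    return changes


current_lead = {"Title": None, "City": "Boston", "LastName": "Do"}
web_findings = {"Title": "VP Sales", "City": "Boston", "LastName": "Doe", "Phone": "555-0100"}

# Phone was found on the web but is not in the allowed list, so it is ignored.
proposal = propose_changes(current_lead, web_findings, ["Title", "City", "LastName"])
```

Restricting updates to an explicit list of fields keeps the agent from writing to fields nobody intended it to touch, and the before/after pairs make the change easy to review.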
How GRAX Data Lake Enables AI-Powered CRM Automation
We were able to build this demo based on prior work from Heroku connecting to Agentforce and our initial post in this series. In that post, we demonstrated the general-purpose nature of an agent and its ability to write SQL from natural-language questions and some prompting. When you give that agent access to your AWS Athena-based data lake, you can ask it all kinds of questions about your data.
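As an illustration of turning a natural-language question into SQL, the sketch below runs the kind of query an agent might generate. Everything specific here is an assumption: the `lead` table and its columns are a hypothetical schema, and an in-memory SQLite database stands in for the Athena-backed data lake so the example runs locally.

```python
import sqlite3

# In-memory SQLite stands in for the data lake; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lead "
    "(id TEXT, firstname TEXT, lastname TEXT, company TEXT, email TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO lead VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("00Q-1", "Jane", "Doe", "Example Corp", "jane@example.com", None),
        ("00Q-2", "John", "Roe", "Example Corp", "john@example.com", "CTO"),
    ],
)

# Question: "Which leads at Example Corp are missing a job title?"
# A query an agent might write for that question.
generated_sql = """
    SELECT id, firstname, lastname
    FROM lead
    WHERE company = ? AND (title IS NULL OR title = '')
    ORDER BY lastname
"""
missing_title = conn.execute(generated_sql, ("Example Corp",)).fetchall()
conn.close()
```

The leads returned by a query like this are natural candidates for the enrichment agent, since they are the records with gaps to fill.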
This integration is possible because of GRAX’s Data Lake architecture, which maintains your Salesforce data in AWS Athena using a format optimized for analytics and AI-powered automation. Rather than building complex integrations or paying for expensive third-party enrichment services, you can leverage your existing GRAX infrastructure, open-source agent frameworks like LangChain and LangGraph, and Heroku for hosting to power these AI capabilities.
The GRAX Data Lake’s queryable format means AI agents can quickly access historical patterns and current data, while the backup capabilities ensure you can confidently test and refine these automations with a complete data safety net. Best of all, since you own and control the infrastructure, you can adapt the solution as AI technology evolves, switching between models or updating capabilities without being locked into a specific vendor’s offering.
Getting Started
Getting this up and running is straightforward if you’re already using GRAX. You’ll need three things:
- First, you need the GRAX Data Lake configured for your Salesforce instance. If you haven’t set this up yet, reach out to your GRAX account team—they’ll help you get your data flowing into AWS Athena where it’s ready for AI applications.
- Second, deploy our sample Lead Augmentation app to Heroku. We’ve made this as simple as possible with a “Deploy to Heroku” button in our GitHub repository. Once deployed, you’ll just need to connect it to your GRAX Data Lake by setting the GRAX_DATALAKE_URL environment variable, which you can grab from your GRAX dashboard.
- Finally, connect the app to Salesforce Agentforce. You can follow Heroku’s excellent and detailed tutorial for connecting Agentforce to a Heroku Application.
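For the environment variable in the second step, an app would typically read and check the value at startup so that a missing setting fails fast. The following is a minimal sketch of that pattern, not code from the sample app: `load_datalake_url` is a hypothetical helper, and the URL shown is a placeholder, since the actual format of the value from the GRAX dashboard is not described here.

```python
import os
from typing import Mapping
from urllib.parse import urlparse


def load_datalake_url(env: Mapping[str, str] = os.environ) -> str:
    """Return GRAX_DATALAKE_URL, failing fast if it is missing or malformed."""
    url = env.get("GRAX_DATALAKE_URL", "").strip()
    if not url:
        raise RuntimeError("GRAX_DATALAKE_URL is not set")
    parsed = urlparse(url)
    if not parsed.scheme or not parsed.netloc:
        raise RuntimeError("GRAX_DATALAKE_URL does not look like a URL")
    return url


# Usage with an explicit mapping; the URL is a placeholder, not a real endpoint.
url = load_datalake_url({"GRAX_DATALAKE_URL": "https://datalake.example.com/query"})
```

On Heroku, this kind of value is set as a config var on the app, which the process then sees as an ordinary environment variable.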
That’s it! You can now ask your agent to research and update leads directly from your Salesforce interface. Start with a test lead to get comfortable with how it works—you might be surprised at how much public information it can find and organize for you.

Conclusion
The ability to automatically research and update lead information is just one example of what’s possible when you combine GRAX’s Data Lake with modern AI capabilities. While there’s plenty of buzz around AI in the CRM space, we’re focused on delivering practical solutions to real business problems. By leveraging your existing GRAX infrastructure, open-source frameworks, and Heroku’s proven platform, you can build robust AI-powered lead automation that makes your sales team more efficient today.
The best part? Since your data is already organized and accessible in the GRAX Data Lake, you’re ready to tackle other business challenges as AI technology evolves. Whether it’s Salesforce data enrichment today or the next breakthrough in AI-powered CRM automation tomorrow, having your Salesforce data backed up and queryable in AWS Athena puts you in control of your AI future.
Ready to get started? Check out our sample implementation on GitHub, or reach out to GRAX to learn more about enabling the GRAX Data Lake for your Salesforce instance.
Transcript
Hi, this is Chris from GRAX!
Today, we’re gonna show you how to automate lead enrichment in Salesforce using Agentforce and the GRAX Data Lake hosted on Heroku.
This video builds on our previous demonstration where we created a custom Agentforce action with Heroku. If you haven’t seen that video, I recommend checking it out first to understand the foundation of what we’re building today.
Okay. So today, lead enrichment. It’s a tedious task, but critical for sales teams. Traditionally, you’re either manually searching the web for information about prospects, or you’re paying for a service that has already done it, potentially scraping the web themselves, or in theory, having access to private datasets. But what if we could automate this process using AI?
So in our first demo, we showed the power of allowing frontier models to query your data lake, so you can ask it a question in natural language that normally would have required knowledge of SQL.
In this demo, we’ve enhanced our previous implementation by giving our agent new capabilities.
Our agent can now query Salesforce data, search the web for information, visit websites to gather details, and update Salesforce records directly.
But instead of me telling you, let me show you. So here we’re in Agentforce in Salesforce, and we’re gonna ask GRAX to find more information and update the lead.
So we didn’t want to expose any customer information, so let’s see what we can find out about myself. So right now, I’m a lead. I’m a cold lead. I got DQ’d, and we have my company email address in here.
So what Agentforce has done is it’s calling out to our action that is running on Heroku. And that action itself is an agent that does all this work for us.
Wonderful, so as you can see here, we’ve updated a couple of the different fields on this user from the data lake integration, and that’s the integration user that the agent is operating as. So how did we do this?
So basically, we’ve created another Agentforce action—we call this one our lead enrichment action.
And essentially, we’ve given it more powers. So the last one had a URL connection to the GRAX Data Lake, and that allows the agent to query Salesforce data using SQL. It turns out that our modern LLMs are pretty expert at writing SQL. And so that’s what the Anthropic API key is for. I like using Claude, but you can swap this out with OpenAI, DeepSeek, anything you want. LangGraph makes this just two lines of code to change.
The Tavily API key, so there’s a difference between searching and browsing the web. Searching is pretty straightforward right now. Browsing required a little bit of magic with buildpacks to get the Heroku app working. But once again, that’s what makes Heroku powerful. I added the Playwright Buildpack and a pre-build step, and we were off to the races.
One more thing I wanna show you is our app basically keeps an interaction log of everything that the agent does, so that we can kinda go in there and see what’s going on.
So here’s the query that came over from Agentforce. Here’s the response, and then here’s the conversation flow where the agent is saying, “okay, let’s see what we know about this person. Let’s do a web search.”
Okay, now that we’ve done a web search, visit the websites, and it does make some errors, so here’s where this is the beauty of the agent: It doesn’t get it right the first time, and it basically realizes it had some validation errors and just goes ahead and updates it.
So that’s the power of agents. That’s the power of Agentforce. And the power of GRAX is the data lake architecture: maintaining your Salesforce data in AWS Athena, in a format optimized for analytics and AI (Parquet), is the foundation that enables these intelligent workflows.
Also, because GRAX is backing up the entire Salesforce history, I feel comfortable allowing this agent to make changes in my Salesforce instance because we can always roll our data back with GRAX.
So that’s really it. We wanted to show this quick demo, giving concrete business applications of Agentforce and GRAX Data Lake hosted on Heroku.
The full implementation is available on our GitHub repo. And as always, just feel free to reach out to our GRAX account team if you have any questions about enabling this for your org. Thank you for watching!