Lessons Learned with LangChain

This blog post provides a comprehensive overview of LangChain, a powerful framework for integrating Large Language Models (LLMs) into software applications. It delves into the key benefits of using LangChain, such as abstraction, orchestration, and observability. While LangChain offers numerous advantages, it also has limitations and potential challenges. The post discusses these drawbacks and provides insights into alternative frameworks for LLM development. Overall, it aims to help developers make informed decisions about whether LangChain is the right choice for their specific needs.

Introduction

Generative AI with Large Language Models (LLMs) has finally gained traction as a useful tool across nearly every kind of software application.  However, to integrate LLMs into an application effectively, I believe you should avoid building an integration framework in-house - doing so will most likely limit your ability to switch providers and keep your application portable.  LangChain, as a framework, does an excellent job of abstracting away many of the minor details required to integrate many different LLMs.

Available LLM Providers

Although OpenAI is currently the dominant provider of LLMs and Generative AI services for software applications, it is not the only option.  Thank goodness!

OpenAI has many advantages with its API services.  First, its model responses are generally good and reliable (I’m not going to scientifically quantify that yet - lots of nuance here).  Second, it was the first to offer features like JSON mode and function calling.  Great features!  And perhaps most importantly, it was the first to market.

You may in fact do very well to build your application tied directly to OpenAI’s API services, and you could call it a day there.

But there are other options.  Anthropic’s Claude, for example, delivers comparable responses from less expensive API calls, and Anthropic offers features similar to OpenAI’s as well.  You also have Google’s Gemini, Amazon’s Titan, and a large host of open-source LLMs such as Meta’s Llama 3 and the Mistral models.

You may even find domain focused models, such as Kinetica’s specialized Language-to-SQL model.

An entire universe of LLMs is available to use right now, and they all have their own specialties and advantages.

And finally, you must consider the hosting options for all these models.  Are you calling out to OpenAI’s services directly?  You probably need to protect the content your users submit; your prompts and completions can all potentially contain sensitive information.  Azure and AWS offer LLM hosting with data protections, and other options are available as well.

LangChain Provides Abstraction

How are you going to evaluate and use these models for your task without an abstraction layer?

Because LangChain offers a unified ChatModel interface, all of these models can be hot-swapped at any moment.  Now we can prototype a new prompt with the most powerful Gemini model available, then scale down to Claude Haiku to save both time and money once we start to get consistently usable responses.  I cannot overstate how powerful this feature is for development.
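
As a quick illustration, here is roughly what that hot swap looks like in practice - a minimal sketch, assuming the langchain_openai and langchain_anthropic integration packages and current model names, which may differ in your environment:

# Package names and model identifiers are illustrative assumptions.
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

# Prototype against one provider...
chat_model = ChatOpenAI(model="gpt-4o", temperature=0)

# ...then swap in another without touching the rest of the chain.
# chat_model = ChatAnthropic(model="claude-3-haiku-20240307", temperature=0)

response = chat_model.invoke("Summarize the quarterly sales figures in one sentence.")
print(response.content)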

And speaking of prompts, LangChain offers a fantastic abstraction layer over prompting as well.  Each LLM offers a differing level of support for roles in its input (i.e., human, system, assistant), which can have a material effect on the result, and LangChain handles all of the inner mechanics of those roles for you.  If that means a special JSON payload for the API endpoint, it’s done.  If it’s a special text format, that’s likely provided by the LangChain community.
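
For instance, a role-aware prompt can be written once and rendered for whichever provider sits behind the chat model - a minimal sketch, with the template text being purely illustrative:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise assistant for financial analysts."),
    ("human", "Explain {concept} in two sentences."),
])

# The provider-specific payload (role names, JSON structure, text markers)
# is handled by the chat model integration, not by our application code.
messages = prompt.format_messages(concept="dollar-cost averaging")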

Finally, if you need to redeploy on Azure, AWS, or on-prem, LangChain makes those migrations as painless as possible.

LangChain is more than Abstraction

If LangChain simply provided an abstraction layer over LLM invocations - that could be enough, actually.

Fortunately, LangChain does more.

Orchestration

LangChain also provides support for orchestration via DAGs (Directed Acyclic Graphs), enabling workflows with agentic behavior, routing, error handling with retries, and parallel calls to LLMs or other compute.  This encourages the good software practice of isolating business logic within “chains,” which can then be composed into much more complex, higher-level graphs.
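
As a rough sketch of those primitives, parallel fan-out and retries are both first-class runnables - here chat_model and critique_model are assumed to be chat models defined elsewhere:

from langchain_core.runnables import RunnableParallel

# Fan the same input out to two models and collect both results in a dict.
draft_and_critique = RunnableParallel(
    draft=chat_model,
    critique=critique_model,
)
result = draft_and_critique.invoke("Draft a release note for version 2.1")

# Wrap a flaky step with automatic retries.
resilient_model = chat_model.with_retry(stop_after_attempt=3)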

Subchains can be tested and checked for regressions - a classic problem in software engineering which I’ve found to be even more of a nuisance in fast-changing environments, such as the one we find ourselves in today.
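
As one possible pattern - a sketch using an illustrative build_summary_chain helper and LangChain's fake chat model as a deterministic test double - the subchain can be built from an injected model so tests stay offline:

from langchain_core.language_models import FakeListChatModel
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

def build_summary_chain(chat_model):
    prompt = ChatPromptTemplate.from_messages([("human", "Summarize: {text}")])
    return prompt | chat_model | StrOutputParser()

def test_summary_chain_returns_text():
    # the fake model replays canned responses, keeping the test deterministic
    fake_model = FakeListChatModel(responses=["A short summary."])
    chain = build_summary_chain(fake_model)
    assert chain.invoke({"text": "A very long report..."}) == "A short summary."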

Furthermore, orchestration is simplified through LCEL (the LangChain Expression Language), which makes writing linear pipelines of functions quite simple and easy to read, like:

 
# prompt_formatter, chat_model, and output_parser are LangChain runnables
# defined elsewhere; LCEL's "|" operator composes them into a single chain.
chain = (
  prompt_formatter |
  chat_model |
  output_parser
)

response = chain.invoke(user_input)

These chains are fairly easy to write and comprehend.

Observability

Furthermore, LangChain provides out-of-the-box hooks for observability.  This allows our engineers to evaluate big-picture performance and identify bottlenecks and loopbacks.  How many tokens are used by an invocation?  Which step takes the most time?  With LangChain, these questions are much easier to answer.

LangChain favors LangSmith for leveraging these observability capabilities, and it is an excellent tool in its own right.  Since LangSmith is not open source, you may also consider other options, such as Langfuse or OpenTelemetry, which integrate with your application fairly easily.  We have found that even simple hand-rolled logging callbacks have added value to the applications we have built in LangChain by making tracing and debugging easy at a glance.
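
For reference, a hand-rolled logging callback can be as small as the sketch below (the token-usage lookup is best-effort, since providers report usage differently, and this naive version is not safe for concurrent runs):

import logging
import time

from langchain_core.callbacks import BaseCallbackHandler

logger = logging.getLogger("llm_trace")

class TimingLoggerCallback(BaseCallbackHandler):
    def on_llm_start(self, serialized, prompts, **kwargs):
        self._start = time.perf_counter()
        logger.info("LLM call started with %d prompt(s)", len(prompts))

    def on_llm_end(self, response, **kwargs):
        elapsed = time.perf_counter() - self._start
        usage = (response.llm_output or {}).get("token_usage", {})
        logger.info("LLM call finished in %.2fs, token usage: %s", elapsed, usage)

# Attach per invocation via the config argument:
# chain.invoke(user_input, config={"callbacks": [TimingLoggerCallback()]})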

We have been able to use these tools to isolate and minimize runtime and costs, as well as perform cost/benefit analyses of the techniques we use.  For example, does utilizing something like a ReAct agent justify the potential extra token costs versus a more linear problem-solving approach?

Bootstraps

LangChain also comes with functional implementations of tools and even agents, such as the ReAct agent mentioned above.  These implementations do a great job of abstracting those common tasks in a straightforward and reliable manner.  With out-of-the-box implementations, we can quickly prototype different problem-solving approaches without adding a mountain of technical debt and maintenance to our product offerings.
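
As an example, standing up a ReAct agent from those building blocks takes only a few lines - a sketch assuming chat_model is defined elsewhere and using a trivial stand-in tool:

from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.tools import tool

@tool
def word_count(text: str) -> int:
    """Count the number of words in a piece of text."""
    return len(text.split())

react_prompt = hub.pull("hwchase17/react")  # community ReAct prompt
agent = create_react_agent(chat_model, [word_count], react_prompt)
executor = AgentExecutor(agent=agent, tools=[word_count], verbose=True)

result = executor.invoke({"input": "How many words are in 'the quick brown fox'?"})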

LangChain Is Not Perfect

What’s going on here?

Abstraction comes at a price - we have found situations where the way messages, prompts, and functions ultimately get converted into LLM prompts does not work well for our use case.  This can lead to developing custom overrides and alterations to deeply nested LangChain code.

In a similar vein, we’ve had difficulty in many instances debugging and unraveling stack traces to understand errors when they occur.  The orchestration layer also effectively obfuscates the full chain of events leading up to an error and its downstream consequences.  In these instances, implementing strong telemetry and ways to inspect your full flow is immensely valuable.

Complex Chains are… less fun

Writing linear chains with LCEL is fun, as mentioned above, but LCEL becomes a little less elegant when you write something more complex, like a DAG with parallel paths:

 
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda

# chat_model is assumed to be a chat model defined elsewhere
chain = (
  {
    "left_output": RunnableLambda(lambda x: x["left_input"]),
    "right_output": RunnableLambda(lambda x: x["right_input"])
  } |
  PromptTemplate.from_template("using {left_output} and {right_output}, generate useful text") |
  chat_model
)

chain.invoke({"left_input": X, "right_input": Y})

Already in this relatively simple graph, we are quickly moving out of the comfort zone of whoever has to review this pull request.  Instead of a very simple a->b->c pipeline, the reader is now charged with untangling a web of dictionary closures {} and RunnableLambda instances with hard-coded keys.  We’ve found the price worth paying, but it is a cost nonetheless.  The best way we’ve found to minimize it is to isolate these branching paths as much as possible, then incorporate them back into the larger flow as a single linear unit, as sketched below.
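
In practice, that isolation might look like the following sketch, where the fan-in lives in its own named runnable and the top-level chain reads linearly again (chat_model is assumed to be defined elsewhere):

from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda, RunnableParallel

# Branching logic isolated in one named unit...
merge_inputs = RunnableParallel(
    left_output=RunnableLambda(lambda x: x["left_input"]),
    right_output=RunnableLambda(lambda x: x["right_input"]),
)

# ...so the top-level chain reads like a simple a->b->c pipeline again.
chain = (
  merge_inputs |
  PromptTemplate.from_template("using {left_output} and {right_output}, generate useful text") |
  chat_model
)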

Graph State

Lastly, LangChain has not yet implemented a top-level “state” for use across interconnected subchains.  This leads to awkward workarounds involving “pass-through” nodes or abusing the internal LangChain config object.

Consider a RAG scenario involving a SQL query with a large data payload.  This payload may ultimately build a visualization or color a map or feed a data pipeline, but we may also slice and sample the payload for generating summaries or assuring the quality of generated responses.  Passing through this payload and accessing it within functions has proven to be a difficult task with default LangChain, requiring custom implementations.

An implementation might look something like the following in vanilla LangChain:

 
# slice_data, summary_prompt, chat_model, generate_sql_chain, and
# execute_sql_chain are assumed to be runnables defined elsewhere.
summary_chain = (
  slice_data |
  summary_prompt |
  chat_model
)

chain = (
  generate_sql_chain |
  execute_sql_chain |
  # fan the payload out: keep the raw data and derive a summary from it
  RunnableLambda(lambda x: {"raw_data": x, "summary": summary_chain.invoke(x)})
)

response = chain.invoke("What were the top 5 investments by growth in 2022?")
data = response["raw_data"]
summary = response["summary"]

The maintainers of LangChain have addressed this issue to some degree in their sibling offering, LangGraph, with StateGraphs.  I would say this functionality is a better answer to the issues above, but I’m still not entirely satisfied with its methods for maintaining state, especially in parallel executing branches.
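
For comparison, the LangGraph approach looks roughly like the sketch below: a typed, shared state that every node can read and update.  The node bodies here are purely illustrative stand-ins:

from typing import TypedDict
from langgraph.graph import END, StateGraph

class PipelineState(TypedDict):
    question: str
    raw_data: list
    summary: str

def run_query(state: PipelineState) -> dict:
    # stand-in for generating and executing SQL; the full payload lives in state
    return {"raw_data": [("AAPL", 0.31), ("NVDA", 1.20)]}

def summarize(state: PipelineState) -> dict:
    # downstream nodes can slice state["raw_data"] without pass-through hacks
    return {"summary": f"{len(state['raw_data'])} rows returned."}

graph = StateGraph(PipelineState)
graph.add_node("run_query", run_query)
graph.add_node("summarize", summarize)
graph.set_entry_point("run_query")
graph.add_edge("run_query", "summarize")
graph.add_edge("summarize", END)

app = graph.compile()
result = app.invoke({"question": "What were the top 5 investments by growth in 2022?"})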

Growing Pains

Bringing up LangGraph also leads me to my final point on “LangChain is not perfect.”  With the ever-growing market of LLM functionality, active research into generative AI, and the gradual emergence of industry best practices, LangChain suffers from a bloat of features that have grown organically with the onslaught of new AI technology.

For example, tool calling, a feature which allows LLMs to invoke functions as part of a completion, has expanded rapidly in the past 6 months.  In the beginning, only OpenAI supported binding tools to an LLM.  However, other chat models could still invoke tools through a more manual process that was often wrapped up and provided by the LangChain community.  Great!  Democracy!  Except now, most ChatModels support tool binding directly.  If you wish to use tools with an Anthropic model in AWS Bedrock, you may find that the documentation describes multiple ways to do so, with no clear indication of preference for one method over another.  I have found, through years of experience in software development, that you should have an opinion, and that maintaining multiple equally valid solution paths is difficult over time.
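
To make that ambiguity concrete, the now widely supported direct path looks like the sketch below (model name and tool are illustrative, and whether this or one of the older community wrappers is "preferred" is exactly the question the docs leave open):

from langchain_anthropic import ChatAnthropic
from langchain_core.tools import tool

@tool
def get_stock_price(ticker: str) -> float:
    """Look up the latest price for a stock ticker."""
    return 123.45  # stand-in for a real data source

llm = ChatAnthropic(model="claude-3-haiku-20240307")
llm_with_tools = llm.bind_tools([get_stock_price])

response = llm_with_tools.invoke("What is AAPL trading at?")
print(response.tool_calls)  # structured tool-call requests, if any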

LangChain has split out its core and community functionality into separate Python modules, which makes sense for many reasons but also leads to a disjointed release cycle between dependencies.

And to add more ways to solve problems, we have the aforementioned LangGraph library which adds some nice features to LangChain (cyclical graphs, improved routing, some better bootstraps, some graph state), but again, further fragments the toolset and adds yet more options for how to accomplish the same task with no clear opinion on a “best practice.”

LangChain has Competitors

On a closing note, it would be irresponsible to go “all in” on an LLM framework without awareness of the other options out there.  A few alternatives with strong use-case arguments are:

  • LlamaIndex - A highly visible competitor to LangChain with a stronger focus on RAG applications; it is well established and will likely continue to grow.
  • Prompt Flow - A fairly new offering in this space, Prompt Flow does a few things I like very much, such as the “prompty” file format for all-in-one configuration and prompts.  Prompt Flow is backed by Microsoft, which is investing heavily, with some really interesting Azure services coming out of the pipeline.
  • Haystack - Another strong competitor to LangChain, with many integrations available; definitely worth considering as an alternative.

Conclusions

I’ve found LangChain to be a valuable tool for developing LLM-powered applications due to its useful integrations, abstractions, and orchestration of tasks into a cohesive flow.  As the LLM landscape continues to grow, I believe that LangChain will continue to get better and provide a solid foundation for products moving forward, and I intend to continue using it for the foreseeable future.
