Building a RAG app with Node and React Native: Part 1

The code for this project can be found on GitHub

https://github.com/tanner-west/wikichat

If you keep up with the world of AI-powered apps, then you've likely heard of RAG (Retrieval Augmented Generation). It's a way to generate answers with an LLM like GPT by providing context retrieved from some specific source, like an article or blog post. In this series of posts, I'll document my journey of building a RAG app using React Native, Langchain, and Chroma that lets users chat with Wikipedia articles.

Architectural Overview

Here's a visual overview of how the first iteration of this app is architected.

Architecture diagram of the RAG app

If you're new to RAG apps, the Langchain docs contain a great overview of the concepts and technologies involved, but I'll define some key terms here.

In this first iteration, our React Native app is truly just a client for collecting user queries and displaying the LLM's answers. But thanks to projects like llama.rn, it's entirely feasible to do much, if not all, of the AI work on a user's device. I may explore that avenue in a future post.

The Server

I'll explain the server first, since that's where most of the interesting work is done in this app. Here are the main components:

An HTTP server that exposes a single endpoint, POST /api, which expects a request payload with two properties: the Wikipedia article the user wants to chat with and the question they're asking about it. A rough sketch of the handler follows below.
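
Here's a minimal sketch of what that endpoint could look like with Express. The property names (article, question) and the queryService helper are assumptions for illustration, not necessarily what the repo uses.

```typescript
import express from "express";
// queryService is a hypothetical helper; it performs the retrieval and
// generation steps described in the next section.
import { queryService } from "./queryService";

const app = express();
app.use(express.json());

// POST /api expects { article: string, question: string }
// (property names are illustrative; check the repo for the exact shape)
app.post("/api", async (req, res) => {
  const { article, question } = req.body;
  if (!article || !question) {
    return res.status(400).json({ error: "article and question are required" });
  }
  try {
    const answer = await queryService(article, question);
    res.json({ answer });
  } catch (err) {
    console.error(err);
    res.status(500).json({ error: "Something went wrong" });
  }
});

app.listen(3000, () => console.log("Server listening on port 3000"));
```
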
A query service function that takes the request data and invokes Langchain to perform steps 2, 4, and 5 from the diagram above, namely: retrieving the article chunks most relevant to the question from the Chroma vector store, combining those chunks with the question in a prompt, and generating an answer with the LLM. A rough sketch of this function follows below.
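
This is only a sketch of what such a function might look like using Langchain's JS packages (@langchain/openai, @langchain/community, @langchain/core). It assumes the article chunks were already embedded into a local Chroma collection named "wikipedia", and the metadata filter and model name are illustrative rather than the repo's actual values.

```typescript
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { Chroma } from "@langchain/community/vectorstores/chroma";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

// Assumes article chunks were embedded ahead of time into a Chroma
// collection named "wikipedia" running on the default local port.
export async function queryService(article: string, question: string): Promise<string> {
  const vectorStore = await Chroma.fromExistingCollection(new OpenAIEmbeddings(), {
    collectionName: "wikipedia",
    url: "http://localhost:8000",
  });

  // Retrieve the chunks most relevant to the question.
  // The metadata filter on the article field is illustrative.
  const docs = await vectorStore.similaritySearch(question, 4, { article });
  const context = docs.map((doc) => doc.pageContent).join("\n\n");

  // Combine the retrieved context and the question into a prompt.
  const prompt = ChatPromptTemplate.fromTemplate(
    `Answer the question using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}`
  );

  // Generate the answer with the LLM.
  const llm = new ChatOpenAI({ model: "gpt-4o-mini", temperature: 0 });
  const chain = prompt.pipe(llm).pipe(new StringOutputParser());
  return chain.invoke({ context, question });
}
```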

The Client

Our app client is currently a bare-bones Expo app that lets the user select a Wikipedia article from a hard-coded list and submit a question about it. It calls our API endpoint with that data and presents the user with the result, roughly as sketched below.
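
Here's a hypothetical helper showing how the client could make that request; the URL and payload property names mirror the assumptions in the server sketch above, not necessarily what the repo uses.

```typescript
// A small helper the client could use to call the server.
// API_URL and the payload property names are assumptions carried over
// from the server sketch above.
const API_URL = "http://localhost:3000/api";

export async function askQuestion(article: string, question: string): Promise<string> {
  const response = await fetch(API_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ article, question }),
  });

  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const { answer } = await response.json();
  return answer;
}
```

Here's a short demo of the app in action.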


Clearly this is only a minimal example of a RAG app built with React Native. In future posts, I'd like to show how we can add the following features.