May 19, 2026 · 9 min read · Tutorials · AI

Dynamic Onboarding in React Native: The Two-Phase LLM Pattern (With Code)

Malik Chohra

Creator of WireAI

The pure-AI onboarding that breaks in production, and the two-phase LLM pattern that holds up: fixed Phase 1 base questions, then one bounded LLM call that generates 2-4 personalized cards. WireAI renders the cards, the LLM picks them, the user moves through onboarding in 60 seconds.

Dynamic onboarding in React Native is a flow where an LLM decides the next question (or the next personalized screen) based on what the user just answered. The pattern that holds up in production is two-phase: a fixed base of 3 to 5 questions you always ask, then a second LLM call that generates 2 to 4 personalized cards from those answers. WireAI renders the components, the LLM picks them, the user moves through onboarding in 60 seconds instead of tapping next 12 times.

This is the commercial wedge for WireAI. Onboarding is the one place every B2C app needs personalization, every PM wants higher activation, and every engineer has shipped a static flow that converts at 40%. Two-phase LLM is the upgrade.

What you need before starting

You should be comfortable with React Native (Expo or bare, Hermes runtime), one LLM that supports JSON mode (a cloud model such as OpenAI gpt-4o-mini or Anthropic claude-sonnet-4.5, or a local model through Ollama), and Zod for prop validation. Estimated time to wire a working dynamic onboarding from scratch: 4 to 6 hours. With wireai-rn@0.1.3: 90 minutes.

Why two-phase, not pure-AI?

The instinct is to make the whole onboarding AI-generated. The LLM asks the first question, decides the second, decides the third. I tried that. It breaks in three places:

Latency. Every question is one LLM round trip. A 5-question onboarding turns into 20 seconds of waiting. Users bounce.
Schema drift. The LLM hallucinates question types you do not support. You ship a multiple_choice component, the LLM emits radio_with_image, your render layer crashes.
Cold start. The first question cannot be personalized because the LLM has zero context. You are paying the LLM tax for a question you could have hard-coded.

Two-phase fixes all three.

Phase 1: fixed 3 to 5 base questions. Hard-coded. Instant. Cover the must-have signals.
Phase 2: one LLM call after Phase 1 generates 2 to 4 personalized cards based on the answers. The LLM is bounded (component schemas validated, count capped, fallback prepared).

Total latency: 1 LLM call instead of 5. Validated output. Personalization where it actually matters.

This is the pattern that runs in the AI Pro tier of the AI Mobile Launcher boilerplate today. It is the B2C wedge WireAI was built for.

Step 1: Define your Phase 1 base questions

Pick 3 to 5 questions that capture the user's intent, context, and constraint. For a wellness app:

export const PHASE_1_QUESTIONS = [
  { id: 'goal', type: 'choice', text: 'What brought you here?', options: ['Stress', 'Sleep', 'Focus', 'Energy'] },
  { id: 'time', type: 'choice', text: 'How much time do you have daily?', options: ['5 min', '15 min', '30 min', '60+ min'] },
  { id: 'experience', type: 'choice', text: 'Have you tried this before?', options: ['Never', 'A bit', 'A lot'] },
  { id: 'biggest_blocker', type: 'text', text: 'In one sentence: what gets in your way?', maxLength: 140 },
];

Static. No LLM. Render them as WireAI components if you want consistency, or use plain RN screens. Either works. Rule for picking Phase 1 questions: each one has to inform Phase 2 personalization. Do not ask things you will not use. "Date of birth" is a junk question unless your Phase 2 prompt references age.

Step 2: Collect the answers into a structured profile

After Phase 1, you have a typed answers object. Validate it with Zod before passing to the LLM. If the user skipped a question, decide policy: fall back to a default, or refuse to enter Phase 2. I default to "skip the LLM and show a generic onboarding completion screen" if any field is missing, because personalization quality drops fast with incomplete data.
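
The skip policy above can be sketched as a single parse step that returns null when any answer is missing or invalid. This version is dependency-free so it runs on its own; in the app the same checks would live in a Zod schema. The Phase1Answers shape and the parsePhase1Answers name are assumptions derived from the PHASE_1_QUESTIONS ids, not part of WireAI.

```typescript
// Hypothetical profile shape, derived from the PHASE_1_QUESTIONS ids and options.
const GOALS = ['Stress', 'Sleep', 'Focus', 'Energy'] as const;
const TIMES = ['5 min', '15 min', '30 min', '60+ min'] as const;
const EXPERIENCE = ['Never', 'A bit', 'A lot'] as const;

type Phase1Answers = {
  goal: (typeof GOALS)[number];
  time: (typeof TIMES)[number];
  experience: (typeof EXPERIENCE)[number];
  biggest_blocker: string;
};

// True when value is one of the allowed option strings.
function isOneOf<T extends string>(options: readonly T[], value: unknown): value is T {
  return typeof value === 'string' && (options as readonly string[]).indexOf(value) !== -1;
}

// Returns a typed profile, or null when any answer is missing or invalid.
// A null result means: skip Phase 2 and show the generic completion screen.
function parsePhase1Answers(input: unknown): Phase1Answers | null {
  if (typeof input !== 'object' || input === null) return null;
  const { goal, time, experience, biggest_blocker } = input as Record<string, unknown>;

  if (!isOneOf(GOALS, goal)) return null;
  if (!isOneOf(TIMES, time)) return null;
  if (!isOneOf(EXPERIENCE, experience)) return null;
  if (typeof biggest_blocker !== 'string') return null;

  // Same 140-character limit as the Phase 1 text question.
  const blocker = biggest_blocker.trim();
  if (blocker.length === 0 || blocker.length > 140) return null;

  return { goal, time, experience, biggest_blocker: blocker };
}
```

Returning null instead of throwing keeps the decision in one place: the caller either enters Phase 2 with a complete profile or goes straight to the generic screen.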

Step 3: Write the Phase 2 system prompt

The system prompt is where most dynamic onboarding implementations fail. Vague prompt, vague cards. Here is the shape that works:

const PHASE_2_SYSTEM_PROMPT = `
You generate personalized onboarding cards for a wellness app based on user answers.

INPUT: a JSON object with the user's answers to 4 questions.

OUTPUT: a JSON object with a "cards" array of 2 to 4 WireAI component directives.

AVAILABLE COMPONENTS (use only these):
- MessageBubble: { title: string, body: string }
- ContentSelectCard: { title: string, options: { id: string; label: string }[] }
- ActionCard: { label: string, action: 'next' | 'skip' }
- StatusCard: { value: number, label: string }

RULES:
1. Return ONLY a JSON object with a "cards" array. No prose.
2. Each item must match the schema above exactly.
3. Maximum 4 items. Minimum 2.
4. The first card must reflect the user's stated goal.
5. The last card must be an ActionCard with label "Start" and action "next".
6. Tone: direct, second-person, no fluff. The user is not your friend.

EXAMPLE INPUT:
{ "goal": "Sleep", "time": "15 min", "experience": "Never", "biggest_blocker": "phone in bed" }

EXAMPLE OUTPUT:
{ "cards": [
  { "action": "render", "component": "MessageBubble", "props": { "title": "Sleep in 15 minutes a day", "body": "Phone-in-bed is the most common blocker. We start there." } },
  { "action": "render", "component": "ContentSelectCard", "props": { "title": "When do you put your phone down?", "options": [{"id":"a","label":"10 min before bed"},{"id":"b","label":"30 min"},{"id":"c","label":"1 hour"}] } },
  { "action": "render", "component": "ActionCard", "props": { "label": "Start", "action": "next" } }
] }
`;

Five things in that prompt do load-bearing work. Available components listed by name and schema (without this the LLM invents components). Maximum and minimum count (without this you get 1 card or 14). The "first card must" rule (without this the LLM opens with a generic welcome and loses the personalization). The "tone" rule (without this the LLM defaults to ChatGPT-warm voice, "Welcome to your journey..."; cut it). The example (one concrete input/output pair raises output quality more than 3 abstract rules).
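
One way to keep the AVAILABLE COMPONENTS block from drifting away from what the app can actually render is to generate it from the same map the validator reads. A minimal sketch, assuming a hypothetical COMPONENT_SPECS map and renderComponentSection helper (illustrative names, not a WireAI API):

```typescript
// Hypothetical single source of truth: component name -> prop signature shown to the model.
const COMPONENT_SPECS: Record<string, string> = {
  MessageBubble: '{ title: string, body: string }',
  ContentSelectCard: '{ title: string, options: { id: string; label: string }[] }',
  ActionCard: "{ label: string, action: 'next' | 'skip' }",
  StatusCard: '{ value: number, label: string }',
};

// Renders the "AVAILABLE COMPONENTS" section of the system prompt from the map.
function renderComponentSection(specs: Record<string, string>): string {
  const lines = Object.keys(specs).map((name) => '- ' + name + ': ' + specs[name]);
  return ['AVAILABLE COMPONENTS (use only these):'].concat(lines).join('\n');
}

// The validator can read its allow-list from the same map, so the two cannot diverge.
const ALLOWED_COMPONENTS = Object.keys(COMPONENT_SPECS);
```

The rest of the prompt (rules, example) stays hand-written; only the part that must mirror the render layer is derived.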

Step 4: Call the LLM with JSON-mode

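Use response_format: { type: 'json_object' } (OpenAI) or the equivalent on Anthropic and Gemini. This forces syntactically valid JSON output. Do not trust the LLM to emit JSON without it. JSON mode does not force the output to match your card schemas, so the validation in Step 5 still applies.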

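import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_KEY });

async function generatePhase2Cards(answers: Phase1Answers): Promise<ComponentDirective[]> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    response_format: { type: 'json_object' },
    messages: [
      { role: 'system', content: PHASE_2_SYSTEM_PROMPT },
      // Send the flat answers object so it matches the EXAMPLE INPUT shape in the prompt.
      { role: 'user', content: JSON.stringify(answers) },
    ],
  });

  // content can be null (for example on a refusal); treat that as "no cards".
  const raw = response.choices[0]?.message.content;
  if (!raw) return [];

  try {
    const parsed = JSON.parse(raw);
    // Cards are still unvalidated here; Step 5 checks each one against its schema.
    return Array.isArray(parsed.cards) ? (parsed.cards as ComponentDirective[]) : [];
  } catch {
    // Malformed or truncated JSON: return no cards and let the caller fall back.
    return [];
  }
}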

Latency on gpt-4o-mini for this call: roughly 1.5 to 2.5 seconds in our AI Mobile Launcher dogfood logs (not a controlled benchmark). That is the single LLM call cost. Compare it to 5 sequential LLM calls in the pure-AI approach.

Step 5: Validate and render with WireAI

Validate each card against its Zod schema before rendering. If validation fails, you have two options: discard the card and continue, or show a fallback "let's get started" generic screen.

import { useState } from 'react';
import { WireAIProvider, WireAIMessageRenderer } from 'wireai-rn';
import { z } from 'zod';

const MessageBubbleSchema = z.object({
  title: z.string(),
  body: z.string(),
});

const validatedCards = rawCards.filter((card) => {
  if (card.component === 'MessageBubble') return MessageBubbleSchema.safeParse(card.props).success;
  // ...other components
  return false;
});

function OnboardingPhase2({ cards }: { cards: ComponentDirective[] }) {
  const [index, setIndex] = useState(0);
  const current = cards[index];
  // Nothing to render if every card was discarded; the caller should show the fallback screen.
  if (!current) return null;
  return (
    <WireAIMessageRenderer
      message={{ id: String(index), kind: 'component', payload: current }}
      onAction={() => setIndex((i) => Math.min(i + 1, cards.length - 1))}
    />
  );
}

WireAI's renderer does this validation internally if you use the standard useWireAIThread hook. The manual version above is what to use if you want explicit control over fallback behavior. For more on the built-in component contract, see What Generative UI Is.
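
The "// ...other components" branch in the manual filter can be a lookup table rather than an if-chain. The sketch below is dependency-free so it runs on its own, with hand-written guards standing in for the Zod schemas; in the app each entry would be a Schema.safeParse(props).success call. The CardDirective type, PROP_GUARDS map, and filterValidCards name are assumptions based on the example output in the system prompt, not WireAI exports.

```typescript
// Assumed directive shape, taken from the EXAMPLE OUTPUT in the system prompt.
type CardDirective = { action: string; component: string; props: unknown };
type Guard = (props: unknown) => boolean;

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

function isOption(o: unknown): boolean {
  return isRecord(o) && typeof o.id === 'string' && typeof o.label === 'string';
}

// One guard per component the prompt advertises. Stand-ins for Zod schemas.
const PROP_GUARDS: Record<string, Guard> = {
  MessageBubble: (p) => isRecord(p) && typeof p.title === 'string' && typeof p.body === 'string',
  ContentSelectCard: (p) => {
    if (!isRecord(p) || typeof p.title !== 'string') return false;
    const options = p.options;
    return Array.isArray(options) && options.every(isOption);
  },
  ActionCard: (p) =>
    isRecord(p) && typeof p.label === 'string' && (p.action === 'next' || p.action === 'skip'),
  StatusCard: (p) => isRecord(p) && typeof p.value === 'number' && typeof p.label === 'string',
};

function isValidCard(card: unknown): card is CardDirective {
  if (!isRecord(card)) return false;
  const component = card.component;
  if (typeof component !== 'string' || typeof card.action !== 'string') return false;
  // Own-property check so names like "toString" do not resolve to inherited methods.
  if (!Object.prototype.hasOwnProperty.call(PROP_GUARDS, component)) return false;
  return PROP_GUARDS[component](card.props);
}

// Keeps only cards whose component is known and whose props pass the guard.
function filterValidCards(rawCards: unknown[]): CardDirective[] {
  return rawCards.filter(isValidCard);
}
```

An unknown component such as radio_with_image is dropped here instead of reaching the render layer, which is the schema-drift case from earlier.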

Step 6: Handle the failure modes

Three failure modes you will see in production:

The LLM returns 0 cards. Rare with JSON mode, but happens when the model decides the input is unprocessable. Fallback: render a generic 2-card onboarding completion flow. Do not crash.
Schema validation fails on one card. Common when you add a new component to your registry and forget to update the system prompt. Fix: keep the prompt and the registry in the same file, lint that they are in sync, or use WireAI's built-in component list which is already wired.
The latency spikes to 8+ seconds. Network or model variance. Show a loading state with specific progress text ("Personalizing your first few steps...") rather than a generic spinner. Users wait longer when they see specific text.
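
The zero-card fallback and the count cap can live in one pure function that runs after validation. A sketch under assumptions: the Card type, the FALLBACK_CARDS copy, and the finalizeCards name are illustrative, and the fallback text is placeholder wording.

```typescript
// Assumed card shape for this sketch, matching the example output in the system prompt.
type Card = { action: 'render'; component: string; props: Record<string, unknown> };

const START_CARD: Card = {
  action: 'render',
  component: 'ActionCard',
  props: { label: 'Start', action: 'next' },
};

// Generic 2-card flow shown when Phase 2 produces nothing usable (placeholder copy).
const FALLBACK_CARDS: Card[] = [
  { action: 'render', component: 'MessageBubble', props: { title: 'You are set', body: 'Let us get started.' } },
  START_CARD,
];

const MIN_CARDS = 2;
const MAX_CARDS = 4;

function isStartCard(card: Card): boolean {
  return card.component === 'ActionCard' && card.props.label === 'Start' && card.props.action === 'next';
}

// Takes already-validated cards and returns a list that is always safe to render:
// 2 to 4 cards, ending in the Start ActionCard.
function finalizeCards(validCards: Card[]): Card[] {
  // Drop any Start card the model placed, cap the rest, then append Start so it is always last.
  const body = validCards.filter((c) => !isStartCard(c)).slice(0, MAX_CARDS - 1);
  const cards = body.concat([START_CARD]);
  return cards.length >= MIN_CARDS ? cards : FALLBACK_CARDS;
}
```

Because the function never returns an empty list, the render component does not need its own empty-state branch for Phase 2.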

What still does not work

Streaming Phase 2. The cards arrive as a single JSON array. You cannot show card 1 before card 4 is generated. WireAI streaming plus JSON mode is a roadmap item.
Voice input in Phase 1. Possible, not built. The biggest_blocker text question is the place most users want voice; today it is keyboard only.
Multi-language personalization. The system prompt is English. For German or French apps, you duplicate the prompt per locale and add a locale parameter. Or run a translation step. Not zero-effort.
On-device LLMs. The Phase 2 call works against Ollama (local) with WireAI's Ollama adapter (see the Ollama setup guide). Latency is higher (5 to 10s) and quality is lower than gpt-4o-mini. Fine for privacy-sensitive flows, not for fast B2C activation.
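
The per-locale duplication mentioned above can be kept manageable with a lookup keyed by language tag and an English fallback. A sketch: the PROMPTS_BY_LOCALE map and pickSystemPrompt helper are hypothetical, and the prompt bodies are placeholders for the full Phase 2 prompt in each language.

```typescript
// Hypothetical per-locale prompts. Each value stands in for a full Phase 2 system prompt
// with rules and an example written in that language.
const PROMPTS_BY_LOCALE: Record<string, string> = {
  en: '<English Phase 2 prompt>',
  de: '<German Phase 2 prompt>',
  fr: '<French Phase 2 prompt>',
};

const DEFAULT_LOCALE = 'en';

// Resolves a device locale such as "de-AT" or "fr_CA" to its language prompt,
// and falls back to English for anything without a prompt.
function pickSystemPrompt(locale: string): string {
  const language = locale.toLowerCase().split(/[-_]/)[0];
  const hasPrompt = Object.prototype.hasOwnProperty.call(PROMPTS_BY_LOCALE, language);
  return PROMPTS_BY_LOCALE[hasPrompt ? language : DEFAULT_LOCALE];
}
```

Falling back to English keeps Phase 2 working for unsupported locales, at the cost of English cards in an otherwise translated app; skipping Phase 2 for those locales is the stricter alternative.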

Variants other B2C apps could ship

Once you have the two-phase pattern wired, the variations are cheap:

Goal-setting flows. Phase 1 = where they want to go. Phase 2 = personalized milestone cards.
Fitness app intake. Phase 1 = level, time, equipment. Phase 2 = first-week workout plan as cards.
Habit tracker. Phase 1 = habit, frequency, blocker. Phase 2 = personalized cue and reward cards.
EdTech enrollment. Phase 1 = current skill, target, time per week. Phase 2 = first lesson preview and commitment card.

The pattern is the same. Only the prompt changes.

Where to start

npm install wireai-rn@0.1.3 zod

Dynamic Onboarding glossary at getwireai.com/glossary/dynamic-onboarding. Repo at github.com/chohra-med/wireai-rn (MIT). The two-phase template (env vars, system prompts, Zod schemas above) ships in the AI Pro tier of the AI Mobile Launcher boilerplate at aimobilelauncher.com, where it runs as the actual onboarding flow. Weekly mobile-AI issue at codemeetai.substack.com.

Built by Malik Chohra. 9 years React Native. Shipped an app at 9M monthly users, took a consumer app from a 4.3 to a 4.9 App Store rating, and ran AI spec workflows as App Lead at a Series A startup.

FAQ

What is dynamic onboarding in a React Native app?

Dynamic onboarding is a flow where the questions or screens adapt to the user's answers, usually driven by an LLM. The two-phase pattern is the most reliable shape: fixed base questions (Phase 1, hard-coded), then one LLM call (Phase 2) that generates personalized cards based on the answers. WireAI renders the personalized cards with validated schemas.

Why not let the LLM drive every onboarding question?

Three reasons: latency multiplies (one LLM call per question equals 20+ seconds total), schema drift becomes common (the LLM emits unsupported component types), and the first question cannot be personalized anyway (no context yet). Two-phase keeps Phase 1 instant and bounds the LLM to where personalization actually pays off.

How long does the Phase 2 LLM call take?

In our logs against gpt-4o-mini for a 4-question Phase 1 with a constrained system prompt: roughly 1.5 to 2.5 seconds. Anthropic claude-haiku-4.5 runs at a similar speed. On-device Ollama (Llama 3 8B) takes 5 to 10 seconds depending on the device.

Do I need WireAI to build two-phase onboarding?

No. You can hand-roll the rendering layer if you are comfortable writing JSON-to-component glue and a Hermes-safe streaming layer (if you stream Phase 2 in the future). WireAI gives you 11 built-in components with Zod schemas, an adapter pattern for multiple LLMs, and the Hermes streaming polyfill for free. The decision is whether the 90-minute setup is worth more than 4 to 6 hours of hand-rolling.

Which LLM is best for two-phase onboarding?

gpt-4o-mini (OpenAI) is the cheapest cloud option that reliably emits valid JSON-mode output for this prompt shape. claude-haiku-4.5 (Anthropic) is comparable and slightly better at following the "tone" rule. For local and private flows, Ollama with Llama 3 8B works but with higher latency and looser schema adherence. Pick based on cost, privacy, and how much you trust the model's JSON discipline.

Can I localize dynamic onboarding to other languages?

Yes, but you will duplicate the system prompt per locale (or run a translation step before rendering). The Phase 1 questions translate normally. The Phase 2 prompt needs locale-specific examples in the system message to produce idiomatic output. Doing this properly across 5+ languages adds a non-trivial maintenance load; plan for it before scaling internationally.