Feature request: optional WFGY 16-problem RAG debugger for pandas-ai · Issue #1868 · sinaptik-ai/pandas-ai · GitHub

@onestardao

Description


🚀 The feature

I would like to propose an optional diagnostic add-on for pandas-ai that uses the open source WFGY 16 Problem Map as a failure taxonomy for RAG-style workflows.

Concretely, this could be any of the following (whichever fits your roadmap best):

  • a short “debugging RAG failures” recipe in the docs, showing how to send a failed pandas-ai query (prompt, generated code, error, retrieved context) to an external WFGY debugger script and get back a Problem Map number (No.1–No.16) plus a suggested fix, or
  • a small optional helper / callback that users can plug into their existing pandas-ai pipelines to dump a failed interaction into WFGY for classification.
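To make the second option concrete, here is a minimal sketch of what such an opt-in helper could look like. Everything in it is hypothetical: `FailureRecord`, its field names, and `dump_failure` are illustrative only, not an existing pandas-ai or WFGY API.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical record of a failed pandas-ai interaction. The field names
# are illustrative; they mirror the trace pieces mentioned above
# (prompt, generated code, error, retrieved context).
@dataclass
class FailureRecord:
    prompt: str
    generated_code: str
    error: str
    retrieved_context: list[str]

def dump_failure(record: FailureRecord, path: str) -> None:
    """Serialize a failed interaction so it can be fed to an external
    WFGY debugger script for classification."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(asdict(record), f, indent=2)

# Example: a query that failed because the generated code used a
# misspelled column name.
record = FailureRecord(
    prompt="average revenue per region",
    generated_code="df.groupby('region')['revenu'].mean()",
    error="KeyError: 'revenu'",
    retrieved_context=["schema: region, revenue, year"],
)
dump_failure(record, "wfgy_failure.json")
```

Users would call something like `dump_failure` from an error handler around their pandas-ai pipeline and then run the external debugger on the resulting file.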

The WFGY debugger itself is plain Python that talks to any OpenAI-compatible endpoint (no extra infra). It reads the WFGY Problem Map text files, uses them as a 16-mode failure map (hallucination, retrieval drift, bootstrap ordering, config drift, etc.), and returns:

  • one primary Problem Map number (No.1–No.16)
  • an optional secondary candidate
  • a short explanation and a pointer to the corresponding WFGY doc.
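The round trip can be sketched roughly as below. This is an assumption about the shape of the interaction, not the actual WFGY prompt format: only the prompt-building and reply-parsing sides are shown, and the model reply here is a canned string standing in for whatever an OpenAI-compatible chat endpoint would return.

```python
import re

# Hypothetical prompt/response plumbing for the classification step;
# the real WFGY debugger's prompt format may differ.

def build_classification_prompt(problem_map_text: str, trace: str) -> str:
    """Ask the model to classify a failed trace against the 16-mode map."""
    return (
        "You are a RAG failure classifier. Using the 16-mode failure map "
        "below, answer with a line 'primary: No.<k>' (1-16), optionally a "
        "line 'secondary: No.<k>', and a short explanation.\n\n"
        f"{problem_map_text}\n\nFailed trace:\n{trace}"
    )

def parse_classification(reply: str) -> dict:
    """Extract the primary and optional secondary Problem Map number."""
    result = {"primary": None, "secondary": None}
    for key in result:
        m = re.search(rf"{key}:\s*No\.(\d+)", reply, re.IGNORECASE)
        if m:
            n = int(m.group(1))
            if 1 <= n <= 16:   # ignore out-of-range numbers
                result[key] = n
    return result

# In practice the reply would come from any OpenAI-compatible endpoint
# (e.g. a chat-completions call); here we parse a canned example.
reply = "primary: No.5\nsecondary: No.8\nEmbedding vs semantic mismatch."
print(parse_classification(reply))  # {'primary': 5, 'secondary': 8}
```

Keeping the parsing tolerant (regex over a free-form reply, with range checks) matters because different endpoints format answers differently.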

Motivation, pitch

A lot of pandas-ai users are effectively building lightweight RAG systems over tabular or mixed data: LLMs generate code, hit the database, call tools, and then answer in natural language. When something goes wrong, it is often hard to tell what kind of failure it is:

  • sometimes the retrieval is wrong (wrong file, wrong table, stale embedding),
  • sometimes the reasoning is wrong even though the data is fine,
  • sometimes the infra or config is wrong (missing secrets, startup races, version drift, etc.).

WFGY Problem Map is an open source taxonomy of 16 common AI system failure modes, originally built for RAG debugging. It focuses on things like:

  • No.1 hallucination / chunk drift,
  • No.2 interpretation collapse,
  • No.5 embedding vs semantic mismatch,
  • No.8 missing retrieval traceability,
  • No.14 bootstrap ordering issues,
  • No.16 pre-deploy / secrets drift,

and so on.
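As a shared vocabulary, even a tiny lookup table covers the modes listed above. This sketch includes only the six modes mentioned in this issue (names paraphrased), not all 16:

```python
# Partial mapping of the Problem Map modes mentioned above; names are
# paraphrased from the list in this issue, and the remaining ten modes
# are omitted for brevity.
PROBLEM_MAP_SUBSET = {
    1: "hallucination / chunk drift",
    2: "interpretation collapse",
    5: "embedding vs semantic mismatch",
    8: "missing retrieval traceability",
    14: "bootstrap ordering issues",
    16: "pre-deploy / secrets drift",
}

def describe(mode: int) -> str:
    """Human-readable name for a mode, with a fallback pointer."""
    return PROBLEM_MAP_SUBSET.get(mode, f"No.{mode}: see WFGY Problem Map docs")

print(describe(5))  # embedding vs semantic mismatch
```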

Right now pandas-ai already gives users a lot of power for building RAG-style workflows. A small, optional integration or recipe that says “when something weird happens, send the trace through this 16-mode debugger and see what kind of failure it is” could make debugging much easier, especially for less experienced users.

My goal here is not to change any core behaviour, but to offer a standard vocabulary for failures that many pandas-ai users could share, and to give them a concrete next step when things break.

Alternatives

Right now the main alternative is to run WFGY completely outside of pandas-ai:

  • when a pandas-ai call fails in a confusing way, the developer manually copies the prompt, generated code, logs, and context into a separate WFGY debugger notebook,
  • the notebook classifies the failure into one of the 16 Problem Map modes and suggests a fix.

This works in practice, but it is a bit clumsy and most users will never discover it unless there is some official hook or recipe in the pandas-ai ecosystem.

Another alternative is of course to build a completely separate, pandas-ai-specific taxonomy of failures. My suggestion here is to reuse an existing open source one (WFGY) so that different tools and repos can eventually speak the same language about RAG / LLM failures.

Additional context

If this sounds interesting and in-scope, I am happy to:

  • prepare a small example notebook that shows how to pipe a pandas-ai failure into the WFGY debugger and interpret the output, or
  • draft a short docs section / recipe PR that you can adjust to match your style.

Relevant links:

Either way, totally fine if you feel this is out of scope for pandas-ai. I mainly wanted to ask before sending any PR. Thanks a lot for maintaining this project and for considering the idea.
