flowchart TD
  A(Return schema)
  B(Use string methods to create a <br> mermaid-compliant string <br> from your dataframe)
  C(Save string to .mmd file)
  D(Call file in mermaid cell block of quarto document)
  A --> B --> C --> D
Dynamic Documentation with Quarto
A common pattern to create dynamic documentation
Introduction
Visual documentation is great for grabbing your reader’s attention, but keeping it up to date can be a pain. Because of this overhead, documentation often drifts from the reality of a project and gradually falls out of use. Now no one is using the documentation you spent so long creating, because it no longer reflects the current state of the code: inaccuracies and gaps mean other developers (and you) have left it behind. The project now relies on tacit knowledge¹ to stay alive. If you want to keep receiving vindication for all the hard hours spent creating this software, the person holding that knowledge had better not leave the organisation.
But there is light at the end of the tunnel. In recent years there has been a real push towards dynamic documentation: as your code evolves, so does the documentation, automatically. Packages like Sphinx are widely used in the Python world (see the NumPy documentation), and Quarto is already used by libraries to create thorough and visually intuitive documentation (see Ibis or FastHTML). I will show you a pattern for creating dynamic documentation that takes advantage of Quarto’s integration with Mermaid diagrams and renders HTML documentation that, when refreshed, updates to reflect the current state of a project.
The process
For this example we will create a couple of toy tables to demonstrate how an entity relationship diagram (ERD) can be displayed in your Quarto-rendered documentation. In practice, you would replace this step with a dynamic call that returns a dataframe describing your underlying database schema. For example, using a DBI connection to a SQL database, you could run:
```{sql}
SELECT
  table_name,
  column_name,
  data_type
FROM INFORMATION_SCHEMA.COLUMNS
```
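To make the "dynamic call" step concrete, here is a minimal sketch that pulls the same three fields into an R dataframe through DBI. It assumes the DBI and RSQLite packages are installed and uses an in-memory SQLite database as a stand-in for a real project database. SQLite does not expose INFORMATION_SCHEMA, so the sketch swaps in `sqlite_master` and `pragma_table_info()`; on a database that does provide INFORMATION_SCHEMA you would pass the query above to `dbGetQuery()` instead. The table contents are illustrative only.

```r
# Assumes the DBI and RSQLite packages are installed.
library(DBI)

# In-memory database standing in for a real project database
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Two toy tables (illustrative data only)
dbWriteTable(con, "band_members",
  data.frame(name = c("Mick", "John"), band = c("Stones", "Beatles"),
             stringsAsFactors = FALSE))
dbWriteTable(con, "band_instruments",
  data.frame(name = c("John", "Paul"), plays = c("guitar", "bass"),
             stringsAsFactors = FALSE))

# SQLite has no INFORMATION_SCHEMA, so read the same three fields
# from sqlite_master and pragma_table_info() instead.
schema <- dbGetQuery(con, "
  SELECT m.name AS table_name,
         p.name AS column_name,
         p.type AS data_type
  FROM sqlite_master AS m
  JOIN pragma_table_info(m.name) AS p
  WHERE m.type = 'table'
  ORDER BY m.name, p.cid
")

dbDisconnect(con)
schema
```

The resulting `schema` dataframe has one row per column of each table, which is the shape the rest of this article works with.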
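At a high level, the process follows the flowchart at the top of this article: return the schema, build a mermaid-compliant string from the dataframe, save that string to a .mmd file, and call the file from a mermaid cell block in your Quarto document.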
Let’s focus in on a specific example to give some clarity
Ok, say you want to use mermaid to map an ERD of a database schema you have created as part of a project. You can use dbplyr to query the information schema of this database. Use glue and dplyr to iterate through the resulting dataframe to create a mermaid-compliant string. This can then be written to a .mmd file and referenced in your Quarto-rendered HTML file.
I will use a toy dataset similar to what you could return with the above SQL call:
# use data that ships with dplyr
library(dplyr)
# create a tibble of table metadata for the band_members
# & band_instruments tables, similar to what you'd get
# by querying a database's information schema
tables <- c("band_members", "band_instruments") |>
  purrr::map_df(
    ~ tibble(
      table_name = .,
      column_name = names(get(.)),
      data_type = purrr::map_chr(get(.), class)
    )
  )
tables
# A tibble: 4 × 3
  table_name       column_name data_type
  <chr>            <chr>       <chr>
1 band_members     name        character
2 band_members     band        character
3 band_instruments name        character
4 band_instruments plays       character
Reshape data & save .mmd file
Below we reshape our dataframe into a mermaid-compliant string and save a .mmd file out to our working directory.
library(glue)
# Build Mermaid blocks for each table
er_blocks <- tables |>
  group_by(table_name) |>
  reframe(
    block = glue(" {data_type} {column_name}")
  ) |>
  group_by(table_name) |>
  summarise(
    table_block = glue("{first(table_name)} {{{paste(block, collapse = ' ')}}}"),
    .groups = "drop"
  ) |>
  pull(table_block)
# Combine all table blocks into one Mermaid diagram
er_diagram <- glue("erDiagram {paste(er_blocks, collapse = ' ')}")
# write string to a mermaid file using .mmd suffix
write(er_diagram, "erd.mmd")
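The same reshaping can be wrapped in a small function so it is easy to rerun whenever the schema changes. The sketch below is one possible base R version, not the method used above: `schema_to_erd()` is a hypothetical helper name, and putting each attribute on its own line is my own choice to make the generated file easier to read and diff. The toy `schema` dataframe mirrors the tibble printed earlier.

```r
# Hypothetical helper (not from any package): turn a schema dataframe with
# table_name, column_name and data_type columns into erDiagram source.
schema_to_erd <- function(schema) {
  # One dataframe of rows per table
  by_table <- split(schema, schema$table_name)
  blocks <- vapply(names(by_table), function(tbl) {
    cols <- by_table[[tbl]]
    # One "<type> <name>" attribute line per column
    attrs <- paste0("    ", cols$data_type, " ", cols$column_name)
    paste0("  ", tbl, " {\n", paste(attrs, collapse = "\n"), "\n  }")
  }, character(1))
  paste(c("erDiagram", blocks), collapse = "\n")
}

# Toy schema matching the tibble printed earlier
schema <- data.frame(
  table_name = rep(c("band_members", "band_instruments"), each = 2),
  column_name = c("name", "band", "name", "plays"),
  data_type = "character",
  stringsAsFactors = FALSE
)

erd <- schema_to_erd(schema)
cat(erd, "\n")
```

The returned string can be written out with `write(erd, "erd.mmd")` exactly as before.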
Create mermaid ERD
Call the file by including it in a mermaid code block using the file argument:
```{mermaid}
%%| eval: true
%%| file: "erd.mmd"
```
This renders an ERD in your Quarto output:
erDiagram band_instruments { character name character plays} band_members { character name character band}
Improvements to this workflow
- Expand on the ERD by adding relationships between tables
- Create other types of mermaid diagram
  - A Gantt chart that dynamically updates with task timelines
  - Flowcharts are also cool for conveying high level information (and used in this article)
- There could be scope for a package that transforms dataframes into mermaid-compliant strings. I’m not sure if one already exists.
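As a starting point for the first improvement, relationships can be appended to the diagram source as extra lines. The sketch below is only an illustration under stated assumptions: `add_relationships()` is a hypothetical helper, and the relationship itself (each band member plays zero or more instruments) is made up for the toy tables. In a real project the `rels` dataframe would come from your database's foreign key metadata.

```r
# Hypothetical helper: append relationship lines to erDiagram source.
# rels needs the columns from, cardinality, to and label.
add_relationships <- function(erd, rels) {
  # Mermaid relationship syntax: <from> <cardinality> <to> : <label>
  lines <- paste0("  ", rels$from, " ", rels$cardinality, " ", rels$to,
                  " : ", rels$label)
  paste(c(erd, lines), collapse = "\n")
}

# Entity blocks for the two toy tables, one attribute per line
erd <- paste(
  "erDiagram",
  "  band_members {",
  "    character name",
  "    character band",
  "  }",
  "  band_instruments {",
  "    character name",
  "    character plays",
  "  }",
  sep = "\n"
)

# Assumed relationship, for illustration only
rels <- data.frame(
  from = "band_members",
  cardinality = "||--o{",
  to = "band_instruments",
  label = "plays",
  stringsAsFactors = FALSE
)

erd_with_rels <- add_relationships(erd, rels)
cat(erd_with_rels, "\n")
```

Writing `erd_with_rels` to the .mmd file instead of the plain entity string would keep the rest of the workflow unchanged.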
Footnotes
1. Knowledge kept inside someone’s head.