unified is an interface for processing text with syntax trees and transforming between them.
unified itself is a small module with a small API. Plugins do everything else: minify HTML, lint markdown, check indefinite articles (“a”, “an”), and more.
Three syntaxes are connected to unified, each with a syntax tree definition, a parser, and a stringifier: mdast with remark for markdown, nlcst with retext for prose, and hast with rehype for HTML.
unified defers part of its logic to vfile, a virtual file format that represents the documents being processed, and to unist, a schema for syntax trees.
vfile stores metadata about documents being processed (often, but not always, from the file system). Mainly, it holds a file’s path and its contents. It also tracks messages associated with the file and where in the file they occurred. This powers code linting, shown below with remark-cli, remark-validate-links, and remark-preset-lint-consistent.
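To make the idea of a virtual file concrete, here is a minimal sketch in TypeScript. It is a hypothetical stand-in, not the real vfile API: the names `VirtualFile`, `report`, `checkLineLength`, and `formatReport` are invented for illustration, and the line-length rule is a toy check rather than an actual remark plugin. It only shows how a path, contents, and positioned messages can travel together.

```typescript
// Hypothetical stand-in for a virtual file; NOT the real vfile API.
interface Point {
  line: number;   // 1-indexed line in the document
  column: number; // 1-indexed column in that line
}

interface Message {
  reason: string; // what went wrong
  place: Point;   // where it occurred
  source: string; // who reported it (assumed field, for illustration)
}

class VirtualFile {
  path: string;
  value: string;
  messages: Message[] = [];

  constructor(path: string, value: string) {
    this.path = path;
    this.value = value;
  }

  // Record a message tied to this file and a position in it.
  report(reason: string, place: Point, source: string): Message {
    const message: Message = { reason, place, source };
    this.messages.push(message);
    return message;
  }
}

// Toy check: flag every line longer than `max` characters.
function checkLineLength(file: VirtualFile, max: number): void {
  file.value.split("\n").forEach((line, index) => {
    if (line.length > max) {
      file.report(
        `Line exceeds ${max} characters`,
        { line: index + 1, column: max + 1 },
        "toy-lint"
      );
    }
  });
}

// Format messages as `path:line:column reason`, one string per message.
function formatReport(file: VirtualFile): string[] {
  return file.messages.map(
    (m) => `${file.path}:${m.place.line}:${m.place.column} ${m.reason}`
  );
}

const file = new VirtualFile(
  "readme.md",
  "# Hello\nThis line is deliberately rather long.\n"
);
checkLineLength(file, 20);
console.log(formatReport(file).join("\n"));
```

Because the messages live on the file rather than in the checker, any number of checks can run over the same file and a single reporter can print everything at the end.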
Gatsby uses unified to process markdown for fast static site generation.
debugger by Mozilla uses unified to check its markup and prose.
unist represents documents as syntax trees. Syntax trees come in two flavours: concrete (CST) and abstract (AST). The former has all the information needed to restore the original document completely; the latter does not. ASTs can, however, recreate a document that is syntactically equivalent. For example, CSTs house info on style, such as tabs or spaces, but ASTs do not. This often makes ASTs easier to work with.
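The tree shape can be sketched in a few lines of TypeScript. The interfaces below are modeled on unist's basic idea (every node has a `type`, parent nodes have `children`, literal nodes have a `value`), but they are a simplified assumption, and the `visit` and `toText` helpers are hypothetical, written here for illustration rather than taken from any unist utility. The tree is written by hand, not produced by a parser.

```typescript
// Simplified node shapes modeled on unist (an assumption, not the full schema).
interface BaseNode {
  type: string;
}

interface Parent extends BaseNode {
  children: TreeNode[];
}

interface Literal extends BaseNode {
  value: string;
}

type TreeNode = Parent | Literal;

// Depth-first, pre-order walk that calls `visitor` on every node.
function visit(
  node: TreeNode,
  visitor: (node: TreeNode, depth: number) => void,
  depth: number = 0
): void {
  visitor(node, depth);
  if ("children" in node) {
    for (const child of node.children) {
      visit(child, visitor, depth + 1);
    }
  }
}

// Concatenate the values of all literal nodes under `node`.
function toText(node: TreeNode): string {
  return "children" in node ? node.children.map(toText).join("") : node.value;
}

// A hand-written abstract tree for a paragraph with emphasized text.
// Note what is absent: nothing records whether the emphasis was written
// with asterisks or underscores, which is the kind of style info a CST keeps.
const tree: TreeNode = {
  type: "paragraph",
  children: [
    { type: "text", value: "Hello, " },
    { type: "emphasis", children: [{ type: "text", value: "world" }] },
    { type: "text", value: "!" },
  ],
};

const types: string[] = [];
visit(tree, (node) => {
  types.push(node.type);
});

console.log(types.join(" > "));
console.log(toText(tree));
```

Since the emphasis marker is not stored, a stringifier working from this tree has to pick one, so the output can differ in style from the input while remaining syntactically equivalent.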
For example, say we have the following HTML element: