Markdown-to-plain text — the lossy direction
Markdown-to-plain is the conversion that strips every Markdown marker out of your source and gives you readable prose. Bold, italic, links, code, headings, lists, blockquotes — all the syntax that reads as noise in destinations that don't understand Markdown — are removed. What survives is the text in reading order: paragraphs, list items, quoted passages, and the actual content inside formatting markers.
Calling this conversion lossy is honest framing. You lose information by design. The output cannot be round-tripped back to Markdown without manual reconstruction; the headings are gone, the link URLs are gone, the code-block boundaries are gone. That is the point — you wanted text without formatting, and that is what you get.
The conversion is useful when the destination doesn't render Markdown and isn't going to: a plain-text email body, a CRM note field, a customer-support reply, a job-posting description, an LLM prompt where Markdown markers add tokens without adding meaning, an SMS that has no formatting to begin with. Pasting raw Markdown into those destinations leaves the asterisks and brackets visible to the reader, which always looks unintentional.
Why strip Markdown at all
The recurring use cases concentrate around three workflows. The first is communication tools that don't render Markdown. Email clients that show plain-text mode, CRM systems with text-only note fields, customer-support consoles that escape HTML, ticketing systems with limited formatting, traditional SMS — all places where pasting **important** produces literal **important** instead of bold text. Stripping the syntax before pasting keeps the message readable.
The second is feeding text to language models. LLMs accept Markdown-formatted prompts and many providers explicitly recommend it for instruction structure. But for content prompts — long article excerpts, transcripts, document fragments — the Markdown markers add tokens without changing the model's understanding. Strip the formatting, save tokens, get the same response quality.
The third is data export and analytics. When Markdown content gets fed to text analytics, search indexing, sentiment analysis, or full-text search engines, the markup tokens distort word counts and token frequency. Stripping to plain produces cleaner input for downstream processing. The same logic applies to feeding text into translation tools that don't preserve Markdown syntax cleanly.
Beyond those three, a fourth case is preview and audit. When you want to see what your Markdown actually says (what a non-Markdown reader would experience), the plain output is a quick check. It surfaces missing whitespace, accidental code-block scoping, and link text that doesn't make sense without the URL hint.
How the engine handles each construct
Different Markdown constructs are handled differently in plain output:
Headings lose their # markers and become regular text lines. There is no preserved hierarchy — # Title becomes Title, ## Subtitle becomes Subtitle, with no indentation or marker indicating they were headings.
Bold and italic lose their markers and the inner text is kept as regular text. **important** becomes important. *emphasized* becomes emphasized. ~~struck through~~ becomes struck through.
Inline code loses the backticks but keeps the content. `function()` becomes function().
Code blocks lose their fence markers (```) and the inner content is kept as text. The block is followed by a paragraph break. There is no language identifier preserved — the language hint after the opening fence is dropped because plain text has no place for it.
Links keep the link text and drop the URL. [click here](https://example.com) becomes click here. The URL is gone from the output. This is one of the more destructive changes, and worth understanding upfront.
Bullet lists keep the items, replacing -/*/+ markers with • (a bullet character). Nested lists flatten to the top level.
Numbered lists keep the items with their numbering. 1. First, 2. Second stay as 1. First, 2. Second.
Blockquotes keep their > prefix. A blockquoted paragraph reads as > The original text in the plain output. This makes blockquotes still readable as quoted material even without Markdown rendering. If you want the blockquote markers gone too, post-process the output to strip leading > from lines.
Tables in the source pass through approximately. The pipe characters and alignment row are removed; cell text is joined.
Line breaks inside paragraphs are preserved as plain newlines. Paragraph breaks (blank lines) become double newlines in the output.
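To make the rules above concrete, here is a minimal Python sketch that approximates them with regular expressions. It is not the engine's actual implementation: the function name strip_markdown is hypothetical, tables and underscore-style emphasis are left out, and a regex pass like this will mishandle edge cases (for example, asterisks used as multiplication signs).

```python
import re

def strip_markdown(md: str) -> str:
    """Illustrative approximation of the stripping rules; not the real engine."""
    out = []
    in_fence = False
    for line in md.split("\n"):
        # Fence lines (with any language hint) are dropped; block content is kept as-is.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            out.append(line)
            continue
        line = re.sub(r"^\s{0,3}#{1,6}\s+", "", line)          # headings lose their # markers
        line = re.sub(r"^\s*[-*+]\s+", "• ", line)             # bullets; nesting is flattened
        line = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", line)   # links keep text, drop URL
        line = re.sub(r"\*\*(.+?)\*\*", r"\1", line)           # bold
        line = re.sub(r"\*(.+?)\*", r"\1", line)               # italic
        line = re.sub(r"~~(.+?)~~", r"\1", line)               # strikethrough
        line = re.sub(r"`([^`]+)`", r"\1", line)               # inline code
        out.append(line)                                       # numbered lists and > lines pass through
    return "\n".join(out)
```

Calling strip_markdown("## Subtitle") returns "Subtitle", and numbered items and blockquote lines come back unchanged, which matches the behavior described above.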
Before/after — what reading order looks like
A short documentation excerpt with mixed formatting goes in:
## Quick start
Install the package via npm:
npm install @vust/markdown
Then import the `convert()` function and call it with your text. **Note**: the API is stable across the 1.x release line.
For more detail, see the [API reference](https://vust.ai/docs/markdown).
And comes out as plain text:
Quick start
Install the package via npm:
npm install @vust/markdown
Then import the convert() function and call it with your text. Note: the API is stable across the 1.x release line.
For more detail, see the API reference.
Notice three things: the heading retains its content but loses the ## marker; the inline code keeps its text without backticks; the link's URL is gone, only the text "API reference" remains.
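If the destination needs the URL to survive, one workaround is to rewrite links in the Markdown source before converting, so the address becomes part of the visible text. The sketch below is an assumption about how you might do that yourself, not a feature of the converter; the helper name inline_link_urls is hypothetical, and links with titles or reference-style links are left untouched.

```python
import re

# Inline links only: [text](url). The lookbehind skips image syntax (![alt](src)).
LINK = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")

def inline_link_urls(md: str) -> str:
    """Rewrite [text](url) as 'text (url)' so the URL remains after stripping."""
    return LINK.sub(r"\1 (\2)", md)
```

Run this on the source first, then convert the result. The trade-off is longer output, so it is worth doing only where the reader actually needs the address.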
A list with two levels goes in:
- First level item
  - Nested item with **emphasis**
  - Another nested item
- Another first-level item
And comes out as:
• First level item
• Nested item with emphasis
• Another nested item
• Another first-level item
Nested levels flatten — every item is a top-level bullet.
A blockquote with attribution goes in:
> The mind is everything. What you think, you become.
>
> — Buddha (attributed)
And comes out as:
> The mind is everything. What you think, you become.
>
> — Buddha (attributed)
The blockquote prefix stays. If you want it gone, strip the leading > from each line before pasting into your destination.
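Removing the prefix is a small post-processing step on the plain output. A minimal sketch in Python, assuming the output is available as a single string; the helper name strip_quote_prefix is hypothetical:

```python
import re

def strip_quote_prefix(text: str) -> str:
    """Remove leading '>' markers (including nested '>>') from every line."""
    return re.sub(r"^[ \t]*(?:>[ \t]?)+", "", text, flags=re.MULTILINE)
```

Lines that only contained a bare > become empty, so paragraph breaks inside the quote are preserved.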
What information is intentionally lost
The following Markdown content is removed, not preserved, in the plain output:
- Heading levels and the visual hierarchy they imply
- Bold, italic, strikethrough, and underline emphasis
- Link target URLs (text is kept; URL is gone)
- Code block language hints
- Code fence boundaries (content is kept as plain text)
- Inline code highlighting (content is kept; backticks gone)
- Table column alignment
- Image references (![alt](url) becomes empty or alt-only depending on parsing)
- Footnote references and definitions
- Reference-style link definitions
- HTML pass-through tags
This is the conversion's contract. The output is text in reading order, optimized for destinations that don't render any of these constructs.
Edge cases for LLM prompts and CRMs
LLM prompt token reduction. Stripping Markdown saves a measurable number of tokens. A 1,000-word article with rich Markdown formatting (headings, bullet points, links, code blocks, bold) typically tokenizes to around 1,200 to 1,400 tokens; the same content as plain text is closer to 1,050 to 1,150. Over a long prompt with retrieved context, the savings compound.
CRM note field paste. Most CRMs (Salesforce, HubSpot, Pipedrive, Close) have plain-text note fields. Markdown source pasted there leaves visible asterisks and brackets. Convert before paste; the note reads naturally.
Customer support replies. Some support consoles render Markdown, many don't. When you draft a reply in Markdown for the structure but need to paste into a tool that doesn't render it, run through the plain converter first. The reply reads as polished prose.
Email plain-text body. Modern email is HTML, but transactional pipelines and some clients show plain-text fallback. If your campaign needs both HTML and plain-text bodies, draft in Markdown, render to HTML for the rich version, run through plain conversion for the fallback.
Search and full-text indexing. When indexing content for search (Algolia, Elasticsearch, Postgres full-text), Markdown markers in the indexed text can reduce relevance. Strip first, index the plain version, and link back to the Markdown source.
Translation pipelines. DeepL and Google Translate handle Markdown unevenly — sometimes preserving syntax, sometimes mangling it. For best results, strip to plain, translate, then add Markdown formatting back if needed for the destination.
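The token ranges quoted for LLM prompts above can be turned into a rough savings percentage. The short calculation below uses only those quoted figures; the pairings of low and high ends are an assumption made to bracket the range, and real numbers depend on the tokenizer and the content.

```python
def savings_pct(markdown_tokens: int, plain_tokens: int) -> float:
    """Percentage of tokens saved by stripping Markdown."""
    return 100 * (markdown_tokens - plain_tokens) / markdown_tokens

# Bounds and midpoint implied by the quoted ranges (1,200-1,400 vs 1,050-1,150).
smallest = savings_pct(1200, 1150)   # lightest markup, least compact plain text
largest = savings_pct(1400, 1050)    # heaviest markup, most compact plain text
midpoint = savings_pct(1300, 1100)   # midpoints of both ranges

print(f"{smallest:.1f}% to {largest:.1f}%, about {midpoint:.1f}% at the midpoints")
```

On those figures the saving sits somewhere between roughly 4% and 25% per document, which is why it matters most when a prompt carries many retrieved documents.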
Recovering structure later
Once Markdown has been stripped to plain, getting the formatting back is manual. There is no plain-to-Markdown conversion that recovers original heading levels, link URLs, or emphasis — that information is gone from the input.
The practical workflow is to keep the Markdown source separately. Export to plain when you need to paste somewhere that doesn't render formatting, but archive the Markdown version in your knowledge base, version-control system, or document store. Re-export to plain when needed; do not try to round-trip.
If you have plain text and want to recover approximate Markdown structure (paragraph breaks, list-like patterns), the dedicated plain-to-markdown route handles that — but it cannot recreate links or headings that weren't preserved as syntax in the plain source.
The Telegram bot at @vustMarkdownBot runs the same plain-text conversion engine. Send Markdown content to the bot and request the plain output via the corresponding template; the response is ready to copy. For high-volume conversions, the bot is the better path; for one-off conversions, the web tool is free within daily limits.
The takeaway: Markdown-to-plain is a one-way trip optimized for destinations without Markdown rendering. The conversion does what it says — strips formatting, keeps content, leaves you with text in reading order — and tells you upfront which information does not survive.