What is RTF?
Rich Text Format (RTF) is a document interchange format specified by Microsoft in 1987. It was designed as a neutral exchange format between Microsoft Word, WordPerfect, and other word processors of the era: a text-based format that any of them could read and write, preserving the kind of basic formatting that mattered in pre-web publishing (fonts, colors, paragraph spacing, bold, italic, underline). An RTF file is plain text wrapped in braces and full of backslash-prefixed control words.
RTF peaked in the 1990s and early 2000s as the lingua franca of word-processor exchange. Microsoft continued to update the spec (the last published revision was RTF 1.9.1 in 2008), and RTF remained the default format for cross-platform clipboard paste in macOS and Windows into the 2010s. Today, RTF is a legacy format. Word documents travel as .docx, modern email clients use HTML, and most file exchange happens in PDF or Markdown. But RTF files still exist: in archives, in enterprise content management systems that ran their document pipeline on pre-.docx Word, in clipboard copies from Mail.app and TextEdit, and in the export options of older consumer software.
The conversion problem usually looks like this: someone has a folder of old RTF files — meeting minutes, drafts, form letters, technical specs — that they want in a modern Markdown-based system. The content is worth preserving; the format is not.
Why migrate to Markdown?
The practical reasons line up with the general case for Markdown. RTF renders in fewer and fewer contexts — modern web apps don't render it, note-taking platforms don't accept it, static site generators can't use it. Content locked in RTF is essentially offline content. Moving it to Markdown makes it searchable, diffable in git, viewable on any platform with a text editor, and editable with the same tools as the rest of your content.
A second reason specific to RTF: the format was designed to preserve formatting that we now consider noise. Font name, size, color, paragraph indentation — fine for a 1990s print workflow, not relevant for a 2026 Obsidian vault or Hugo site. The migration is as much about stripping decoration as preserving content. RTF-to-Markdown conversion naturally removes the visual cruft and keeps the structural text.
Third: RTF is genuinely hard to read as source. The control-word syntax (\rtf1\ansi\deff0{\fonttbl...}) makes manual editing unpleasant. Markdown source is approximately the plain text of the document with a few prefix characters. Anyone touching the content later will thank you for the format.
What you lose in the migration: non-default fonts (RTF can specify any installed font), text color (Markdown has no color syntax without HTML), exact paragraph spacing, tab-stop rulers, table cell formatting, and embedded images or objects. If any of those matter for your use case (regulatory archives, legal documents where formatting is part of the record), RTF may still be the right format to keep alongside the Markdown. But for content where the text matters and the visual styling is incidental, Markdown is strictly better.
Manual approach
RTF is not practical to convert by hand. The control-word syntax is designed for programs, not humans. A typical RTF file starts like this:
{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}{\f1 Helvetica;}}
{\colortbl;\red0\green0\blue0;}
\pard\fs24 The quick brown fox jumps over \b the lazy dog.\b0\par
}
To hand-convert, you have to: strip the outer {\rtf1...} wrapper, find and delete the font table ({\fonttbl...}), find and delete the color table ({\colortbl...}), identify \pard and \fs24 as paragraph and font-size controls (delete them), translate \b ... \b0 toggles to **, translate \i ... \i0 to *, convert \par to paragraph breaks, and handle any \'XX hex escapes.
For a short document, that's a lot of careful work. For a folder of 50 files, it's impractical. The right call for manual handling is: don't. Use a tool.
Even simple visual inspection of RTF is hard. Finding "where does the actual text start" requires scanning past the font and color tables, which can be large. A 10-page document's RTF source is often 50 KB of markup around 20 KB of content.
Automated approach (our tool)
Our converter handles the common RTF subset. You paste your RTF source (or the contents of a .rtf file opened as plain text), and get Markdown out. Coverage:
- Document wrapper: the outer {\rtf1\...} braces and version declaration are stripped
- Font / color / style / info tables: all stripped (we don't preserve font specifics, by design)
- Paragraph breaks: \par control words become blank-line paragraph separators
- Inline formatting toggles: \b text \b0 becomes **text**; \i text \i0 becomes *text*; \ul text \ul0 becomes **text** (Markdown doesn't have underline, so we fall back to bold, which is safer than a silent drop)
- Hex escapes: \'XX (where XX is two hex digits) becomes the literal character for that code point. Works correctly for ASCII; best-effort for Windows-1252 characters in a document encoded that way
- Other control words: \fsN (font size), \cfN (color), \fN (font), \liN (indent), \qN (alignment), \lang, \rtlch, \ltrch, and the dozens of others are stripped. Their formatting effect does not survive the conversion.
- Remaining braces: dropped after control words are processed, since groups carry no semantic meaning in the plain-text output
Not covered: embedded images ({\pict...}), tables (\trowd row definitions), fields ({\field...} merge fields, hyperlinks defined via fields), and mail-merge structures. These pass through with their control-word syntax stripped but without meaningful conversion. You may see residual stray text that needs hand-cleaning.
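As a rough illustration of the coverage described above, the following Python sketch applies the same steps in the same order. It is not our tool's actual source: the names `drop_groups` and `rtf_to_markdown`, the regular expressions, the list of skipped groups, and the choice of cp1252 decoding are all assumptions made for the example. It also ignores escaped braces (\{ and \}) and control symbols such as \~.

```python
import re

# Groups whose whole content is discarded (assumed list for this sketch).
SKIPPED_GROUPS = ("{\\fonttbl", "{\\colortbl", "{\\stylesheet",
                  "{\\info", "{\\pict", "{\\*")


def drop_groups(rtf: str, prefixes=SKIPPED_GROUPS) -> str:
    """Remove brace-balanced groups that start with one of the prefixes."""
    out, i = [], 0
    while i < len(rtf):
        if rtf.startswith(prefixes, i):
            depth = 0
            while i < len(rtf):          # skip to the matching close brace
                if rtf[i] == "{":
                    depth += 1
                elif rtf[i] == "}":
                    depth -= 1
                i += 1
                if depth == 0:
                    break
        else:
            out.append(rtf[i])
            i += 1
    return "".join(out)


def rtf_to_markdown(rtf: str) -> str:
    text = drop_groups(rtf)
    # Hex escapes: \'XX -> one character, decoded here as Windows-1252.
    text = re.sub(
        r"\\'([0-9a-fA-F]{2})",
        lambda m: bytes([int(m.group(1), 16)]).decode("cp1252", errors="replace"),
        text,
    )
    # \par (but not \pard) becomes a blank-line paragraph separator.
    text = re.sub(r"\\par(?![a-zA-Z]) ?", "\n\n", text)
    # Bold and underline toggles both map to **; italic toggles map to *.
    text = re.sub(r"\\(?:b0?|ul0?|ulnone)(?![a-zA-Z0-9]) ?", "**", text)
    text = re.sub(r"\\i0?(?![a-zA-Z0-9]) ?", "*", text)
    # Every other control word is stripped, along with one trailing space.
    text = re.sub(r"\\[a-zA-Z]+-?\d* ?", "", text)
    # Remaining braces carry no meaning in the output.
    text = text.replace("{", "").replace("}", "")
    # Collapse runs of three or more newlines to a single blank line.
    return re.sub(r"\n{3,}", "\n\n", text).strip()
```

Run on the four-line sample from the manual section, this sketch yields `The quick brown fox jumps over **the lazy dog.**`. Tables, fields, and images fall outside it, just as they fall outside the coverage list.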
Common gotchas
Accented characters from Windows-1252 source. Older RTF files often use Windows-1252 encoding. \'e9 should be é (e with acute accent). Our converter uses String.fromCharCode(0xE9), which produces the Latin-1 é, correct in this case. Above 0x7F, Windows-1252 and ISO-8859-1 agree on 0xA0–0xFF but diverge on 0x80–0x9F, where ISO-8859-1 has control codes and Windows-1252 has printable punctuation. If your source uses those code points (curly quotes, dashes, euro sign), you may need a post-conversion pass to replace them with the correct Unicode: \'80 → € (euro sign), \'91 → ‘ (left single quote), \'92 → ’ (right single quote), \'93 → “ (left double quote), \'94 → ” (right double quote), \'96 → – (en dash), \'97 → — (em dash).
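The post-conversion pass mentioned above could look like the following Python sketch. It assumes the converted text contains raw code points U+0080 to U+009F where Windows-1252 punctuation was intended, and remaps each one through Python's built-in cp1252 codec. The names `C1_TO_CP1252` and `repair_cp1252` are made up for this example.

```python
# Build a translation table for the 0x80-0x9F range, where Windows-1252
# and ISO-8859-1 disagree. Five byte values are undefined in cp1252 and
# are left untouched.
C1_TO_CP1252 = {}
for code in range(0x80, 0xA0):
    try:
        C1_TO_CP1252[code] = bytes([code]).decode("cp1252")
    except UnicodeDecodeError:
        pass  # e.g. 0x81 and 0x8D have no Windows-1252 meaning


def repair_cp1252(text: str) -> str:
    """Replace stray C1 control characters with their Windows-1252 meaning."""
    return text.translate(C1_TO_CP1252)
```

Characters at 0xA0 and above, such as é, pass through unchanged because the two encodings already agree there.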
Paragraph spacing may collapse. RTF uses \par for paragraph breaks. Our converter replaces each \par with a blank line (two newlines), then normalizes any run of three or more newlines back to two. Consecutive empty paragraphs in the source become a single break in the output. Usually that's the right thing. If you want empty paragraphs preserved as-is, hand-edit after conversion.
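The two-step behavior described above can be sketched in Python as follows. This is an illustration under assumptions, not our converter's code: the function name `convert_paragraph_breaks` and the exact regular expressions are invented for the example.

```python
import re


def convert_paragraph_breaks(text: str) -> str:
    # Step 1: each \par becomes a blank line. The lookahead keeps
    # longer control words such as \pard from matching.
    text = re.sub(r"\\par(?![a-zA-Z]) ?", "\n\n", text)
    # Step 2: any run of three or more newlines collapses to two,
    # so consecutive empty paragraphs become a single break.
    return re.sub(r"\n{3,}", "\n\n", text)
```

If empty paragraphs need to survive, skipping the second step is one option, at the cost of uneven spacing in the output.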
Underline becomes bold, not a specific annotation. Markdown has no native underline syntax — the only way to get an underline is inline HTML (<u>text</u>). Our converter opts for bold as the safest fallback. If you need underlines preserved for some specific reason (legal documents, edit-trail notation), either accept the loss or post-process to replace **...** with <u>...</u> where underline was the original intent. You would need source-level annotation to distinguish; our converter does not track which bolds came from \b versus \ul.
Embedded images drop silently. {\pict...} groups in RTF carry image data (PNG, JPEG, or Windows metafile) as hexadecimal text by default, or as raw binary after a \bin control word. Our converter strips them along with other { } groups. The image is lost. If your RTF contains important images, extract them separately before conversion (open the file in Word or TextEdit, save each image, then run the text through our tool).
Tables come through as raw text. RTF table syntax (\trowd, \cellx, \cell, \row) is stripped but the cell contents flow together without structure. You'll get the cell text in reading order with no table formatting. If tables matter, hand-convert each one to GFM pipe syntax, or use a dedicated RTF table parser before running through our tool.
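For simple tables, a small pre-pass can do the hand conversion for you. The Python sketch below assumes a flat table fragment (no nested tables, no merged cells, plain text in each cell) and treats the first row as the header, which GFM requires. The function name `rtf_table_to_gfm` and the separator characters are choices made for this example only.

```python
import re

CONTROL_WORD = re.compile(r"\\([a-zA-Z]+)(-?\d+)? ?")
CELL_END, ROW_END = "\x1f", "\x1e"  # private separators for this sketch


def rtf_table_to_gfm(fragment: str) -> str:
    def mark(match: re.Match) -> str:
        word = match.group(1)
        if word == "cell":
            return CELL_END      # \cell closes a cell
        if word == "row":
            return ROW_END       # \row closes a row
        return ""                # \trowd, \cellxN, \intbl, ... are dropped

    marked = CONTROL_WORD.sub(mark, fragment).replace("{", "").replace("}", "")
    rows = []
    for raw_row in marked.split(ROW_END):
        # Text after the last \cell in a row is not part of any cell.
        cells = [c.strip().replace("|", "\\|") for c in raw_row.split(CELL_END)[:-1]]
        if cells:
            rows.append(cells)
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    rows = [row + [""] * (width - len(row)) for row in rows]  # pad short rows
    lines = ["| " + " | ".join(row) + " |" for row in rows]
    lines.insert(1, "| " + " | ".join(["---"] * width) + " |")
    return "\n".join(lines)
```

Anything beyond this shape (row spans, nested formatting, cell-level alignment) is better left to a dedicated parser.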
Hyperlinks defined via fields. Newer RTF uses {\field{\*\fldinst HYPERLINK "url"}{\fldrslt display}} for hyperlinks. Our converter strips the field structure, leaving the display text but losing the URL. If hyperlinks matter, either pre-process the source to extract them or use Pandoc.
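One way to pre-process the source is to rewrite each hyperlink field as a Markdown link before conversion. The Python sketch below handles only the simple field shape shown above; real files often nest extra groups and formatting inside \fldinst and \fldrslt, which this pattern does not match. The names `HYPERLINK_FIELD` and `fields_to_markdown_links` are hypothetical.

```python
import re

# Matches {\field{\*\fldinst HYPERLINK "url"}{\fldrslt display}} with
# no nested groups inside either part.
HYPERLINK_FIELD = re.compile(
    r'\{\\field\{\\\*\\fldinst\s*HYPERLINK\s+"([^"]+)"\s*\}'
    r'\{\\fldrslt\s*([^{}]*)\}\}'
)


def fields_to_markdown_links(rtf_text: str) -> str:
    """Replace simple hyperlink fields with [display](url)."""
    return HYPERLINK_FIELD.sub(
        lambda m: f"[{m.group(2).strip()}]({m.group(1)})", rtf_text
    )
```

Square brackets and parentheses are not RTF syntax, so the resulting link text should pass through a control-word stripper untouched.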
Font changes don't translate. If your RTF had blocks in a fixed-width font (code samples, typically set in Courier or Consolas), our converter can't know that was code. All text flows as prose. Hand-wrap code samples in fenced code blocks after conversion.
When to use Pandoc instead
Pandoc has an RTF reader (added in version 2.14.2, so older installs need upgrading) that handles the structural content our tool skips. Tables, embedded images (extracted to a media directory), hyperlink fields, and mail-merge structures all go through correctly. The invocation is pandoc input.rtf -f rtf -t gfm -o output.md, optionally with --extract-media=./images to save embedded images as separate files.
For bulk RTF archives, Pandoc plus a shell script is the way: iterate over every .rtf in a directory, convert each one, save to a parallel Markdown directory. Our browser tool handles the ad-hoc cases (a single RTF attachment, a clipboard copy from an old email) without requiring a local install.
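The batch job can be written in any scripting language. Here is a minimal sketch in Python rather than shell, using the same Pandoc flags as the invocation above. It assumes pandoc is on the PATH; the function names, the directory layout, and the choice to put extracted media in a `media` folder under the output directory are illustrative assumptions.

```python
import subprocess
from pathlib import Path


def output_path_for(rtf_path: Path, src_root: Path, dest_root: Path) -> Path:
    """Mirror the source tree under dest_root, swapping .rtf for .md."""
    return dest_root / rtf_path.relative_to(src_root).with_suffix(".md")


def build_command(rtf_path: Path, md_path: Path, media_dir: Path) -> list:
    return [
        "pandoc", str(rtf_path),
        "-f", "rtf", "-t", "gfm",
        f"--extract-media={media_dir}",
        "-o", str(md_path),
    ]


def convert_tree(src_root: Path, dest_root: Path) -> None:
    for rtf_path in sorted(src_root.rglob("*.rtf")):
        md_path = output_path_for(rtf_path, src_root, dest_root)
        md_path.parent.mkdir(parents=True, exist_ok=True)
        # check=True stops the batch on the first file Pandoc rejects.
        subprocess.run(
            build_command(rtf_path, md_path, dest_root / "media"), check=True
        )
```

A call such as `convert_tree(Path("archive"), Path("markdown"))` would walk the whole archive. Dropping `check=True` and logging failures instead may suit large, messy collections better.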
Pandoc's RTF coverage is among the most complete of any free tool. If you have any volume of RTF to migrate, setting up Pandoc is worth the ten minutes it takes.
Migration workflow
For a typical RTF-to-Markdown migration:
- Inventory your source. Count RTF files, rough page count, and survey for tables and images. If tables / images are frequent, plan to use Pandoc. If the files are mostly prose, our tool is appropriate.
- Pre-extract images. If any RTF files have embedded images you want to keep, open each in Word / TextEdit / LibreOffice, save images to a folder, note the filenames. You'll reference them from the converted Markdown.
- Convert each file. Paste the RTF source into our tool for a few files at a time. For larger batches, scripting Pandoc is faster.
- Post-process Windows-1252 characters. Search output for mojibake: any Â, Ã, or unrenderable characters suggest encoding issues. Run a pass to normalize to Unicode equivalents.
- Re-embed images. Find each point in the Markdown where an image was in the original RTF, and add a Markdown image reference to the extracted file.
- Hand-convert any tables. Check the output against the source for tables; our tool leaves them as flowing text. Either hand-write GFM pipe tables or re-run that file through Pandoc.
- Add frontmatter. If your target (Jekyll, Hugo, Obsidian) uses YAML frontmatter, add title, date, and tags to each file.
- Review output. Open each converted file in a Markdown preview. Spot-check that bold and italic are applied correctly, paragraph breaks are sensible, and no control-word residue is visible.
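The inventory step at the top of this list can be partly automated. The Python sketch below counts the control words that signal tables, images, and fields in each file, which is a rough proxy for whether Pandoc is needed. The marker list and function names are assumptions for the example, and the counts are occurrences of control words (table rows, not tables).

```python
from pathlib import Path

# Control words that hint at content a simple converter will not handle.
MARKERS = {"table_rows": "\\trowd", "images": "\\pict", "fields": "\\field"}


def survey_rtf(source: str) -> dict:
    """Count occurrences of each marker in one RTF source string."""
    return {name: source.count(marker) for name, marker in MARKERS.items()}


def inventory(folder: Path) -> dict:
    report = {}
    for path in sorted(folder.rglob("*.rtf")):
        # latin-1 maps every byte to a character, so reading never fails.
        source = path.read_text(encoding="latin-1")
        report[str(path.relative_to(folder))] = survey_rtf(source)
    return report
```

Files where every count is zero are good candidates for the quick path; the rest deserve a closer look before you pick a tool.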
A small RTF archive (20–50 files, mostly prose) takes two to four hours with our tool. A larger archive (hundreds of files, complex formatting) is a Pandoc + scripting job measured in half-days to days.