Strip md formatting

Setup

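// This regex is designed to remove common markdown formatting.
// Only each literal's .source is used below, so the flags written on the
// literals are dropped; the flags passed to the RegExp constructor apply to
// the whole combined pattern. 'm' is needed for the ^ and $ anchors.
// Caveat: in the combined pattern, \1 in the italics alternative refers to
// group 1 (the bold delimiter), not to the italics delimiter.
const markdownRegex1 = new RegExp(
  [
    /!\[.*?\]\(.*?\)/g, // Images: ![alt text](url)
    /\[.*?\]\(.*?\)/g, // Links: [link text](url)
    /^#{1,6}\s/gm, // Headers: #, ##, etc.
    /(\*\*|__)(.*?)\1/g, // Bold: **text** or __text__
    /(\*|_)(.*?)\1/g, // Italics: *text* or _text_
    /~~(.*?)~~/g, // Strikethrough: ~~text~~
    /`{3,}[\s\S]*?`{3,}/g, // Code blocks: ```code```
    /`(.*?)`/g, // Inline code: `code`
    /^\s*[-*+]\s/gm, // Unordered lists: -, *, +
    /^\s*\d+\.\s/gm, // Ordered lists: 1., 2.
    /^>\s/gm, // Blockquotes: >
    /^-{3,}\s*$/gm, // Horizontal rules: ---
  ]
    .map((r) => r.source)
    .join('|'),
  'gm'
);

// Single pass with the regex built once, outside the function.
// Every match is replaced with '', so matched spans are removed whole.
function removeMarkdown1(markdown) {
  return markdown.replace(markdownRegex1, '').trim();
}
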
function removeMarkdown(markdown) {
  // This regex is designed to remove common markdown formatting.
  const markdownRegex = new RegExp(
    [
      /!\[.*?\]\(.*?\)/g, // Images: ![alt text](url)
      /\[.*?\]\(.*?\)/g, // Links: [link text](url)
      /^#{1,6}\s/gm, // Headers: #, ##, etc.
      /(\*\*|__)(.*?)\1/g, // Bold: **text** or __text__
      /(\*|_)(.*?)\1/g, // Italics: *text* or _text_
      /~~(.*?)~~/g, // Strikethrough: ~~text~~
      /`{3,}[\s\S]*?`{3,}/g, // Code blocks: ```code```
      /`(.*?)`/g, // Inline code: `code`
      /^\s*[-*+]\s/gm, // Unordered lists: -, *, +
      /^\s*\d+\.\s/gm, // Ordered lists: 1., 2.
      /^>\s/gm, // Blockquotes: >
      /^-{3,}\s*$/gm, // Horizontal rules: ---
    ]
      .map((r) => r.source) // .source drops each literal's flags
      .join('|'),
    'gm' // 'm' lets the ^ and $ anchors above match at every line
  );

  return markdown.replace(markdownRegex, '').trim();
}

function removeMarkdownChained(markdown) {
  let text = markdown;

  // Remove images
  text = text.replace(/!\[.*?\]\(.*?\)/g, '');
  // Remove links
  text = text.replace(/\[.*?\]\(.*?\)/g, '');
  // Remove headers
  text = text.replace(/^#{1,6}\s/gm, '');
  // Remove bold
  text = text.replace(/(\*\*|__)(.*?)\1/g, '$2');
  // Remove italics
  text = text.replace(/(\*|_)(.*?)\1/g, '$2');
  // Remove strikethrough
  text = text.replace(/~~(.*?)~~/g, '$1');
  // Remove code blocks
  text = text.replace(/`{3,}[\s\S]*?`{3,}/g, '');
  // Remove inline code
  text = text.replace(/`(.*?)`/g, '$1');
  // Remove unordered lists
  text = text.replace(/^\s*[-*+]\s/gm, '');
  // Remove ordered lists
  text = text.replace(/^\s*\d+\.\s/gm, '');
  // Remove blockquotes
  text = text.replace(/^>\s/gm, '');
  // Remove horizontal rules
  text = text.replace(/^-{3,}\s*$/gm, '');

  return text.trim();
}


const md = `# Comprehensive Markdown Document

## Introduction

Welcome to this **comprehensive** sample of markdown. This document is designed to test various markdown removal functions. It includes _most_ of the common markdown syntax you'll encounter. You can use this to verify that your text processing logic works as expected.

Here is a link to the [official Markdown guide](https://www.markdownguide.org).

---

## Text Formatting

Here are some examples of text formatting:

*   **Bold text** using asterisks.
*   __Bold text__ using underscores.
*   *Italic text* using asterisks.
*   _Italic text_ using underscores.
*   ~~Strikethrough text~~.

## Lists

### Unordered List

*   First item
*   Second item
    *   A nested item
*   Third item

### Ordered List

1.  Step one
2.  Step two
3.  Step three
    1.  A nested step
4.  Step four

## Block Elements

> This is a blockquote. It's often used for quoting text from another source.
>
> It can span multiple lines.`
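
One caveat about the single-pass variants in the setup: `RegExp.prototype.source` returns only the pattern text, so flags on the individual literals are not carried into the combined expression, and a numbered backreference such as `\1` is resolved against the combined group numbering. The sketch below illustrates both effects on small inputs, then shows one possible content-preserving single-pass variant built on named groups. `joinSources`, `inlineRegex`, and `stripInline` are hypothetical names introduced for illustration; they are not part of the benchmark.

```javascript
// Hypothetical helper: join regex sources into one alternation.
function joinSources(patterns, flags) {
  return new RegExp(patterns.map((r) => r.source).join('|'), flags);
}

// 1. Flags live on the RegExp object, not in .source.
const header = /^#{1,6}\s/gm;
const sample = 'Intro\n## Title';
const headerNoM = joinSources([header], 'g'); // ^ matches only at index 0
const headerWithM = joinSources([header], 'gm'); // ^ matches at each line start
const keptHeader = sample.replace(headerNoM, '');
const strippedHeader = sample.replace(headerWithM, '');

// 2. After joining, \1 in the italics pattern points at the bold group.
//    A backreference to a group that did not participate matches the empty
//    string, so a lone "*" is matched on its own.
const bold = /(\*\*|__)(.*?)\1/;
const italic = /(\*|_)(.*?)\1/;
const standalone = '2 * 3'.replace(new RegExp(italic.source, 'g'), '');
const joined = '2 * 3'.replace(joinSources([bold, italic], 'g'), '');

// 3. One possible fix (an assumption, not the benchmark's code): named
//    groups keep each backreference tied to its own delimiter, and a
//    replacer function returns the inner text instead of ''.
const inlineRegex = joinSources(
  [
    /(?<boldMark>\*\*|__)(?<bold>.*?)\k<boldMark>/,
    /(?<italicMark>\*|_)(?<italic>.*?)\k<italicMark>/,
    /~~(?<strike>.*?)~~/,
    /`(?<code>.*?)`/,
  ],
  'g'
);

function stripInline(text) {
  return text.replace(inlineRegex, (...args) => {
    // With named groups, the last replacer argument is the groups object.
    const groups = args[args.length - 1];
    return groups.bold ?? groups.italic ?? groups.strike ?? groups.code ?? '';
  });
}

const stripped = stripInline('**bold** and _it_ and ~~gone~~ and `x`');

console.log(JSON.stringify(keptHeader)); // "Intro\n## Title"
console.log(JSON.stringify(strippedHeader)); // "Intro\nTitle"
console.log(JSON.stringify(standalone)); // "2 * 3"
console.log(JSON.stringify(joined)); // "2  3"
console.log(stripped); // bold and it and gone and x
```

Because the replacer is invoked once per match, a variant like `stripInline` may perform differently from the string-replacement versions, so it would need its own test case rather than standing in for the existing ones.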

Test runner

Test                           Code                        Ops/sec
Multiple passes                removeMarkdownChained(md)   ready
Single pass                    removeMarkdown(md)          ready
Single pass + pre-initialized  removeMarkdown1(md)         ready
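
To reproduce this kind of comparison outside the page, a rough harness along the following lines could be used. It first checks that two variants return the same output, since a timing comparison only means something if they do, and then estimates operations per second. This is a minimal sketch, assuming a runtime with a global `performance.now()` (browsers and recent Node.js). `measureOpsPerSec`, `sameOutput`, and the two stand-in strip functions are hypothetical names and are much simpler than the functions in the setup.

```javascript
// Hypothetical harness: repeatedly call fn(input) for about durationMs
// milliseconds and return the observed calls per second.
function measureOpsPerSec(fn, input, durationMs = 50) {
  for (let i = 0; i < 100; i++) fn(input); // warm-up calls, not timed
  let runs = 0;
  let elapsed = 0;
  const start = performance.now();
  while (elapsed < durationMs) {
    fn(input);
    runs++;
    elapsed = performance.now() - start;
  }
  return (runs / elapsed) * 1000;
}

// Hypothetical check: true when every function returns the same string.
function sameOutput(fns, input) {
  const outputs = fns.map((fn) => fn(input));
  return outputs.every((out) => out === outputs[0]);
}

// Stand-in variants that strip only header and blockquote markers.
const stripChained = (text) =>
  text.replace(/^#{1,6}\s/gm, '').replace(/^>\s/gm, '');
const blockRegex = /^#{1,6}\s|^>\s/gm; // built once, reused on every call
const stripSinglePass = (text) => text.replace(blockRegex, '');

const input = '# Title\n> quote\nplain';
const equivalent = sameOutput([stripChained, stripSinglePass], input);
const chainedRate = measureOpsPerSec(stripChained, input);
const singlePassRate = measureOpsPerSec(stripSinglePass, input);

console.log('same output:', equivalent);
console.log('chained ops/sec:', Math.round(chainedRate));
console.log('single pass ops/sec:', Math.round(singlePassRate));
```

Figures from a loop like this vary with the engine, input size, and machine load, so they are best read as relative indications rather than stable measurements.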