PHP remove tag with content

$str = 'some text <MY_TAG>tag <em>contents </em></MY_TAG> more text ';

My questions are: How do I retrieve the content, tag <em>contents </em>, that sits between <MY_TAG> .. </MY_TAG>?

And

How do I remove <MY_TAG> and its contents from $str?

I am using PHP.

Thank you.

asked Mar 4, 2010 at 18:19


For removal I ended up just using this:

$str = preg_replace('~<MY_TAG(.*?)</MY_TAG>~si', "", $str);

Using ~ instead of / as the delimiter avoids the errors caused by the forward slash in the closing tag, which seemed to be an issue even with escaping. Leaving > off the opening tag allows for attributes or other characters and still matches the tag and all of its contents.

This only works where nesting is not a concern.

The si modifiers mean s = the dot also matches line break characters, i = case insensitive. The ? in .*? already makes the match ungreedy. Do not add the U modifier on top of it: U inverts greediness, so .*? would become greedy again.
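As a quick check of how the lazy quantifier and the U modifier interact, here is a minimal sketch. The sample string is made up for illustration; only the pattern comes from the answer above.

```php
<?php
// Hypothetical sample input with two separate <MY_TAG> blocks,
// one with an attribute and one in lower case spanning a line break.
$str = 'a <MY_TAG id="1">x</MY_TAG> b <my_tag>y
z</my_tag> c';

// Lazy quantifier, no U modifier: each block is removed on its own.
$lazy = preg_replace('~<MY_TAG(.*?)</MY_TAG>~si', '', $str);

// With U added, .*? flips back to greedy, so the match runs from the
// first opening tag to the last closing tag and takes " b " with it.
$greedy = preg_replace('~<MY_TAG(.*?)</MY_TAG>~Usi', '', $str);

var_dump($lazy);
var_dump($greedy);
```

If the two results differ on your own input, the U modifier is the likely cause.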

answered Aug 20, 2013 at 17:39

squarecandy



If MY_TAG cannot be nested, try this to get the matches:

preg_match_all('/<MY_TAG>(.*?)<\/MY_TAG>/s', $str, $matches);

And to remove them, use preg_replace instead.
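Put together, retrieval and removal might look like the sketch below. The sample string is an assumption modeled on the question, and the variable names $inner and $cleaned are only illustrative.

```php
<?php
// Assumed sample input, modeled on the question.
$str = 'some text <MY_TAG>tag <em>contents </em></MY_TAG> more text ';

// $matches[0] holds the full matches, $matches[1] the captured inner content.
preg_match_all('/<MY_TAG>(.*?)<\/MY_TAG>/s', $str, $matches);
$inner = $matches[1];

// The same pattern with preg_replace drops the tag and everything inside it.
$cleaned = preg_replace('/<MY_TAG>.*?<\/MY_TAG>/s', '', $str);

var_dump($inner);
var_dump($cleaned);
```

Note that the text on either side of the removed tag keeps its own whitespace, so the result contains two consecutive spaces.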

answered Mar 4, 2010 at 18:22

Gumbo



You do not want to use regular expressions for this. A much better solution would be to load your contents into a DOMDocument and work on it using the DOM tree and standard DOM methods:

$document = new DOMDocument();
$document->loadXML('<root/>');

// Build a fragment from the text; appendXML() needs well-formed XML
$fragment = $document->createDocumentFragment();
$fragment->appendXML($myTextWithTags);
$document->documentElement->appendChild($fragment);

$MY_TAGs = $document->getElementsByTagName('MY_TAG');
foreach($MY_TAGs as $MY_TAG)
{
    $xmlContent = $document->saveXML($MY_TAG);
    /* work on $xmlContent here */

    /* as a further example: */
    $ems = $MY_TAG->getElementsByTagName('em');
    foreach($ems as $em)
    {
        $emphasizedText = $em->nodeValue;
        /* do your operations here */
    }
}
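The loop above only reads the tags. For the removal half of the question, one possible DOM-based sketch follows. The input string and the $result variable are assumptions for illustration, and the input has to be well-formed XML for appendXML() to accept it.

```php
<?php
// Hypothetical input; it must be well-formed XML.
$myTextWithTags = 'some text <MY_TAG>tag <em>contents </em></MY_TAG> more text ';

$document = new DOMDocument();
$document->loadXML('<root/>');
$fragment = $document->createDocumentFragment();
$fragment->appendXML($myTextWithTags);
$document->documentElement->appendChild($fragment);

// Copy the live node list to an array first: removing nodes while
// iterating a live DOMNodeList can skip elements.
$nodes = iterator_to_array($document->getElementsByTagName('MY_TAG'));
foreach ($nodes as $node) {
    $node->parentNode->removeChild($node);
}

// Serialize the remaining children of <root> back into a string.
$result = '';
foreach ($document->documentElement->childNodes as $child) {
    $result .= $document->saveXML($child);
}

var_dump($result);
```

Unlike the regex approaches, this also copes with MY_TAG elements nested inside each other, at the cost of requiring parseable markup.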

answered Mar 4, 2010 at 23:00

Kris


Although the only fully correct way to do this is not to use regular expressions, you can get what you want if you accept it won't handle all special cases:

preg_match('/<em[^>]*?>.*?<\/em>/i', $str, $match);
// Use this only if you aren't worried about nested tags.
// It will handle tags with attributes

And

$str = preg_replace('/<MY_TAG[^>]*?>.*?<\/MY_TAG>/i', "", $str);

answered Mar 4, 2010 at 18:29


Nicole


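I tested this function; it also handles tags nested inside other tags (though not a tag nested inside another tag of the same name). Leave the third argument FALSE to keep the listed tags and strip every other tag with its content, or pass TRUE to strip only the listed tags. Found here: https://www.php.net/manual/en/function.strip-tags.php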

<?php
function strip_tags_content($text, $tags = '', $invert = FALSE) {

  preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
  $tags = array_unique($tags[1]);
   
  if(is_array($tags) AND count($tags) > 0) {
    if($invert == FALSE) {
      return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    else {
      return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
    }
  }
  elseif($invert == FALSE) {
    return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
  }
  return $text;
}




// Sample text:
$text = '<b>sample</b> text with <div>tags</div>';

// Result for:
echo strip_tags_content($text);
// text with

// Result for:
echo strip_tags_content($text, '<b>');
// <b>sample</b> text with

// Result for:
echo strip_tags_content($text, '<b>', TRUE);
// text with <div>tags</div>

answered Jan 2, 2021 at 5:48


proseosoc
