Regular expression to remove all the attributes from the HTML tags | Just Share It

Regular expression to remove all the attributes from the HTML tags

The following is the procedure to remove the attributes from HTML tags.

Create a regular expression that matches each tag in the string, so that the match can be replaced by the bare tag.
  1. An HTML tag has the form <tag attribute="value">. Let's divide it into patterns.
    1. Every tag starts with '<'.
    2. Next comes the tag name. HTML tag names are case-insensitive, so we allow both a to z and A to Z, and we also allow the digits 0 to 9 to cover tags such as h1 and custom tags with digits in their names. So the regular expression will be something like: ([a-zA-Z0-9]*).
    3. After the tag name comes either the attributes or, for a self-closing tag, a space followed by '/' (e.g. <br />). So the next part will be something like ([^>]*), which matches any run of characters other than '>'.
    4. Finally, the tag ends with '>'.
  2. So the final regular expression is: (<(\/?[a-zA-Z0-9]*)([^>]*?)>). The optional \/ before the tag name lets it match closing tags such as </p> as well. Group 2 captures the tag name (with the leading '/' for a closing tag) and group 3 captures the attributes, so replacing each match with '<', group 2, and '>' drops the attributes.
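The steps above can be tried out with a minimal Python sketch. The regular expression is the one from this post. The function name `strip_attributes` and the replacement string `<\2>` are assumptions made for illustration, since the post does not name a language or a replacement.

```python
import re

# Pattern from the post. Group 2 is the optional '/' plus the tag name,
# group 3 is everything between the tag name and the closing '>'.
TAG_PATTERN = re.compile(r"(<(\/?[a-zA-Z0-9]*)([^>]*?)>)")


def strip_attributes(html: str) -> str:
    """Rewrite every tag as '<' + group 2 + '>' (assumed replacement)."""
    return TAG_PATTERN.sub(r"<\2>", html)


if __name__ == "__main__":
    sample = '<a href="x" class="y">link</a><br />'
    print(strip_attributes(sample))  # prints: <a>link</a><br>
```

Because group 3 also swallows the trailing " /" of a self-closing tag, <br /> comes out as <br>. The tag name group may also be empty, so a declaration such as <!DOCTYPE html> would be reduced to <>. Input containing declarations or comments may need to be handled separately.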
You can test the regular expression in different languages here.


About hiteshbal91
